I won't comment on the first bit as i've not personally tested in that are but G...

rjh29 · on March 28, 2023

I have tested it :) I've asked it to translate English fictional text into Japanese, it falls over often. It's unnatural and often makes no sense at all. It doesn't compare to a typical professional translation (which are often not that idiomatic either), let alone a really good one.

I'm sure it'll be doing that in five years, but not now.

One interesting thing is that's it's nondeterministic, so sometimes 'For chrissakes' turns to ちくしょう (Damn!) but sometimes to クリスのために (for Chris' sake). Sometimes 'the goddamn door' turns into クソドア ('shit door'), sometimes the goddamn changes the phrasing of the whole sentence instead. If you run it five times and take the best sentences out of all five runs it's probably quite good. Maybe prompting would help too, I said "idiomatic Japanese" but it still usually translated it in a very "foreigner Japanese" way typical of US drama/movie translations.

bumbledraven · on March 28, 2023

Are you giving it multiple paragraphs to translate at once so that it has enough context for a good translation? If so, would you mind sharing a sample input and output that you found unsatisfactory?

In "Can GPT-4 translate literature?" (Mar 18, 2023) [https://youtu.be/5KKDCp3OaMo?t=377], Tom Gally, a former professional translator and current professor at the University of Tokyo, said:

> …the point is, to my eye at least, the the basic quality of the translation [from Japanese to English] is as good as a human might do, and with some relatively mild editing by a sensitive human editor, it could be turned into a novel that would be readable and enjoyable.

rjh29 · on March 28, 2023

I don't think we disagree. The video says the translation will be "readable" but needs several days of an experienced editor passing over it. That's an amazing result, but again, it's not as good as a human yet. It's way faster and it'll make media accessible to tons of people.

Like he says, there's lots of ambiguity in Japanese that needs to be handled, gender not being specified until later, etc. and an editor would need to spend time going over it - but it saves months of traditional work. There are words and _concepts_ that are hard to translate, there are cultural issues, dialects, slang, registers. So yeah it'll make the media accessible, but it won't be as a good as a skilled translator.

bumbledraven · on March 28, 2023

He didn't say it was merely "readable"; he said (as I quoted in GP) "the basic quality of the translation is as good as a human might do."

raincole · on March 28, 2023

Last night I used GPT-4 to translate the first several pages of Ted Chiang's Lifecycle of Software Objects (a sci-fi piece) from English to Chinese. I'd say it's about as good as me, save a few minor errors. It's safe to say it performs better than a "tired me", and some translators I've seen on the market.

I'm a native speaker of Chinese, but not a professional translator.

famouswaffles · on March 28, 2023

Mind sharing output ?

I mean i can if you want (chinese though) but enough people lie on the internet.

kolinko · on March 28, 2023

It may depend on a language. For Polish - which is considered one of the most difficult languages due to various forms of words, it works almost perfect - on par with average human translators.

lowefk · on March 28, 2023

Here is GPT-4's translation, and I find no issues: https://imgur.com/a/oOtf4RD

nico · on March 28, 2023

> I don't think people realize how good Bilingual LLMs are at translations

This.

GPT/ChatGPT is able to even translate between different "accents" or dialects of the same language. For example it can give you the same sentence in Mexican, Argentinean or Peruvian Spanish.

Example:

Me: Give me a sentence in spanish that is different in Mexican, Argentinean and Peruvian Spanish. Write the sentence in each dialect.

ChatGPT: Mexican Spanish: "¿Qué onda, güey? ¿Cómo estás?" Argentinean Spanish: "¿Qué onda, boludo? ¿Cómo estás?" Peruvian Spanish: "¿Qué tal, causa? ¿Cómo estás?"

These sentences mean "What's up, dude? How are you?" in English. The primary difference is the slang term used for "dude" or "friend" in each dialect: "güey" in Mexican Spanish, "boludo" in Argentinean Spanish, and "causa" in Peruvian Spanish.

zquzra · on March 28, 2023

I don't recommend you call random people "boludo" in Argentina.

nico · on March 28, 2023

It really depends on the tone and context. If you are a tourist and say it in a joking manner, people are probably going to laugh. If you say it in anger to someone, they might not like it very much.

Similar to how a lot of swear words work in many languages.

It's interesting to see how what matters is not the word, but the intention behind it. At the end we are trying to communicate meaning, and words are just one of our tools to do it.

gwoolhurme · on March 28, 2023

I am also multilingual as well and I've tested it personally. English <-> Portuguese does really well, but Portuguese <-> Japanese or even Japanese <-> English is not as good as a human translator by a long shot because of a lot of hidden subtext in conversation. Even something that a university student would probably pickup on in their first year of Japanese as a foreign language. It is still much better than GPT-3.5, so much so that it made a lot of waves here in Japan, but a few friends who work in translation of books and manga find it is not really a go-to tool yet (yet...).

famouswaffles · on March 28, 2023

Oh for sure i don't mean to say it's excellent in every language. But i personally think a lot of that is training data representation. Doesn't need to be anywhere equal but for instance after English(93%), the biggest language representation for GPT-3's training corpus is french at...1.8 % Pretty wild.

Of Course i don't know the data for GPT-4

gwoolhurme · on March 28, 2023

I am sure it will improve even further as you pointed out the languages outside of English are fairly low in data represented. However, I guess you said you speak Chinese correct? How well does it do with certain things like older poetic Chinese hanzi? In Japanese if there is a string of kanji it tends to mess up the context. Another area of Japanese it seems poorest at is keigo or polite business Japanese. The way you speak to a superior is almost a different language. So I unfortunately still can't use GPT-4 to help me with business emails (yet).

famouswaffles · on March 28, 2023

I didn't try with old poetic stuff. Passages sampled from 5 books released in the last 2 decades. You can see what I did thoroughly here. Before GPT-4. Basically a comparison between GLM-130b (English/Chinese model) vs Deepl, Google chatGPT(3.5) etc https://github.com/ogkalu2/Human-parity-on-machine-translati...

Mandarin isn't the second language I speak but I officially compared with it because I wanted to test also with a model that had more equivalent corpus training than the very lopsided gpt models. And Chinese/English is the only combo that has a model of note in that regard.

maeil · on March 28, 2023

What language pairs are you talking about? I don't think people realize just how much the difficulty level and the state of technology differ depending on that choice.

famouswaffles · on March 28, 2023

English/Chinese is what i've tried on.

and you can see talk on English/Japanese here - https://youtu.be/5KKDCp3OaMo?t=377