I'm sure "Bard" was primarily a Shakespeare reference (The Bard of Avon, frequently just The Bard), and I liked it too. An appropriate name for a technology that's all about language.
Gemini sounds cool and sci-fi though, and maybe it's a bit easier to localize since it's just straight Latin.
To me, bard just sounds phonetically gross. Reminds me of “fart” or “beard.” It calls to mind medieval stuff: the Monty Python mud scene, Skyrim’s most annoying NPCs, plucking lutes. But Gemini? That sounds like a legendary space mission; this collective engineering push against the boundaries of human knowledge.
When I hear "bard", I think of this guy from the Asterix comics first: https://asterix.com/en/portfolio/cacofonix/ - who is notorious for getting on everyone's nerves with his constant singing.
> We are not talking here about the rain he brings on each time exercises his vocal cords, but rather about the prevailing atmosphere in the village: when it is time to party, when wild boar are roasting on the spit, you can be sure to find Cacofonix tied hand and feet with a gag in his mouth.
I remember when the iPad was announced, and everyone said that people would only ever think of feminine products when they heard the name. It might have been true for a few months, but now it seems quaint that we ever had such concerns.
Bard is really funny to me to make fun of. It feels like the discount version of ChatGPT. Like the way that (ironically) TV shows would get Microsoft sponsorship and the characters would say "oh you should Bing that", a phrase no human would normally say, I like to say "ah, let me see what Bard thinks about this".
Understand that this is not condescending in any way, as I do not have this experience.
If there are these "feelings" around these words, how is any sentence correctly taken at face value? How does one communicate to these people the direct and correct meaning of the terms used?
For example, "sentence" sounds like "seance". Do they feel like I'm asking the spirits of the dead?
"Correct" sounds like "wrecked". Do they assume that everything is broken in the above sentence?
Is communication fraught with unknown minefields of unintended emotions and misunderstandings?
Not at all, these "unintended" emotions can be ignored for the most part. But if you ask me, Google is, to my foreign ear, one of the stupidest brand names I know of, due to its phonetic resemblance to some words from my native tongue.
Bards were the people who kept history and genealogy before written history. Think like Homer rather than Shakespeare. I think the name was meant more to evoke the idea that the AI is a repository of all linguistic knowledge in the same way that the bard was. And maybe also the idea that the AI was at your service in the same way the bard was at the service of the ruler.
It's not a bad name, but personally when I first heard the name Bard I chuckled because LLMs had already come under so much criticism for their tendency to embellish the truth or say stuff that is just straight up false but sounds cool.
> The story concerns [...] an old Bard, a child's computer whose sole function is to generate random fairy tales. The boys download a book about computers into the Bard's memory in an attempt to expand its vocabulary, but the Bard simply incorporates computers into its standard fairy tale repertoire.
"Gemini" must refer to its inherently multimodal origins?
It's not a text-based LLM that was later adapted to include other modalities. It was designed from the start to seamlessly understand and work with audio, images, video and text simultaneously. Theoretically, this should give it a more integrated and versatile understanding of the world.
The promise is that multimodality baked in from the start, instead of bolting image recognition on to a primarily text-based LLM, should give it superior reasoning and problem-solving capabilities. It should excel at complex reasoning tasks to draw inferences, create plans, and solve problems in areas like math and programming.
I don't know if that promise has been achieved yet.
In my testing so far, Gemini Advanced seems equivalent to ChatGPT 4 in most of my use cases. I tested it on the last few days' worth of programming tasks that I'd solved with ChatGPT 4, and in most cases it returns exactly what I wanted on the first response, compared with the lengthy back-and-forth required with ChatGPT 4 to arrive at the same result.
But when analyzing images Gemini Advanced seems overly sensitive and constantly gives false rejections. For example, I asked it to analyze a Chinese watercolor and ink painting of a pagoda-style building amidst a flurry of cherry blossoms, with figures ascending a set of stairs towards the building. ChatGPT 4 gave a detailed response about its style, history, techniques, similar artists, etc. Gemini refused to answer and deleted the image because it detected people in the image, even though they were very small, viewed from the back, no faces, no detail whatsoever.
In my (limited) testing so far, I'd say Gemini Advanced is better at analyzing recent events than ChatGPT 4 with Bing. This morning I asked each of them to describe the current situation with South Korea possibly acquiring a nuclear deterrent. Gemini's response was very current and cited specific statements by President Yoon Suk-yeol. Even after triggering a Bing search to get the latest facts, the ChatGPT 4 response was muddy and overly general, with empty and obvious sentences like "pursuing a nuclear weapons program would confront significant technical, diplomatic, and strategic challenges".
It seems odd to me that this would necessarily work better, considering that humans evolved different capabilities many millennia apart and integrated them all with intelligence comparatively late in the evolutionary cycle. So it’s not clear that multimodal from the get-go is a better strategy than bolting on extra modalities over time. It could be, though, since technology is built differently from evolution, but it's interesting to consider.
The constellation Gemini gets its name directly from the Greek mythological twins, Castor and Pollux.
Each twin had different capabilities. Pollux was a powerful warrior while Castor was an intellectual tactician.
The twins possessed an extraordinary fraternal bond, each loyal and devoted to protecting the other.
Together, they accomplished what they couldn't do individually. Their combined strengths made them far more effective than either could be alone.
Just as text, images, audio and video convey different knowledge, relationships and reasoning than text by itself, their combined strengths in a single model should be more powerful than any model trained on only one modality.