Global News

Moshi by Kyutai: A New Era in AI Chatbots That Can Recognize Emotions!

French AI startup Kyutai has reportedly unveiled "Moshi", a voice-activated AI assistant modeled on GPT-4o. Kyutai Labs appears to have introduced this fresh take on AI chatbots following the negative reception that OpenAI's ChatGPT received, and Moshi is positioned as a competitor to OpenAI's GPT-4o. According to early reports, many users did not find GPT-4o's "voice mode" satisfactory, and Kyutai's Moshi appears to offer a better one.

According to Kyutai, Moshi speaks with a variety of accents. It is also expected to offer approximately seventy distinct emotional and speaking styles. The AI can even handle two audio streams simultaneously, meaning it can listen and speak at the same time, so users can expect a chatbot that converses in a "human-like" way.

Additional Updates ~

Kyutai appears to have stated that Moshi was fine-tuned on over 100,000 synthetic dialogues created with Text-to-Speech (TTS) technology. Kyutai also intends to help Moshi learn the subtleties and tones of human communication, and the company is said to have worked with a professional voice actor to improve Moshi's voice. Moshi is intended to support realistic voice conversations with consumers, similar to those offered by Google Assistant or Alexa. It is powered by the Helium 7B model.

Kyutai showed off Moshi's abilities in a video demonstration. Throughout the presentation, the Kyutai team interacted with Moshi to present it as a mentor or friend. Moshi also embodied characters in role-plays, which further demonstrated its inventiveness.

12 Jul 2024

Global News

Google's Parent Company Alphabet Introduces Revolutionary AI Agent, Project Astra

This is a major development in the era of artificial intelligence! Alphabet, the parent company of Google, has revealed an artificial intelligence agent capable of answering real-time queries across video, audio, and text, as part of a series of projects aimed at demonstrating its AI capabilities and addressing allegations that it has fallen behind competitors.

Collaborative Step With OpenAI ~

During an annual developer conference on Tuesday, CEO Sundar Pichai showcased the Silicon Valley giant's new "multimodal" AI helper, Project Astra, which is driven by an enhanced version of its Gemini model. Astra was one of a series of announcements that showcased Google's new AI-centric vision. It follows product introductions and updated AI models from Big Tech rivals such as Meta, Microsoft, and Microsoft's partner OpenAI.

In a video presentation, Google's prototype AI assistant replied to voice instructions by analysing what it saw through a phone camera or smart glasses. It correctly identified code sequences, proposed modifications to electrical circuit schematics, recognised the King's Cross district of London through the camera lens, and reminded the user where they had left their glasses.

According to Pichai, Google intends to bring Astra's capabilities to its Gemini app and other products this year. However, he stressed that, while the ultimate "goal is to make Astra seamlessly available" throughout the company's products, it will be rolled out slowly and "the path to productization will be quality-driven." Sir Demis Hassabis, head of Google DeepMind, the company's AI research division, stated that reducing response time to conversational levels is a significant engineering task. "It is amazing to see how far AI has come, especially when it comes to spatial understanding, video processing, and memory."

Google Also Announced Significant Modifications to its Core Search Engine ~

Beginning this week, all US users will get an "AI Overview" (a quick AI-generated summary answer to the question) at the top of many common search results, followed by clickable links interspersed with ads below it. The company also stated that the search engine would be able to answer complex queries using multi-step reasoning, which means that the AI agent will be able to make several separate decisions in order to accomplish a task, and that it would help users create search queries via voice and video.

Liz Reid, head of Google Search, stated that the goal was to "remove some of the legwork in search" and that the AI Overview will be made available to people in other regions of the world later this year.

The changes come as OpenAI challenges Google's search business. The San Francisco-based start-up's ChatGPT chatbot gives rapid and comprehensive responses on many topics, threatening to replace standard search results that consist of a list of links and advertisements. OpenAI has also struck agreements with media groups to include current information to improve its replies.

Further Announcements This Week ~

Recently, OpenAI unveiled a faster and less expensive version of the model that drives ChatGPT, potentially in a bid to outdo Google's announcements. ChatGPT can likewise interpret speech, video, images, and code within a single interface.

Along with these new and enhanced AI technologies, Google also unveiled Veo, which uses text prompts to make videos; Imagen 3, which produces images; and Lyria, a model for AI music production. Gemini Advanced subscribers will be able to design custom chatbots, or "Gems," to assist with particular tasks. The company's flagship Gemini 1.5 Pro model has also been improved. At 2 million tokens, it now has a significantly larger context window. The context window is the quantity of information, such as code or pictures, that the model can access to produce a response; a larger one allows it to follow complex instructions and retrieve information from past exchanges.

“I believe AI is going to change the world more than anything in the history of humanity. More than electricity.” ~ Kai-Fu Lee, AI Expert, Chairman & CEO of Sinovation Ventures, Author of 'AI Superpowers' and 'AI 2041'

17 May 2024

Star Citizens

Meet Prafulla Dhariwal, Indian Prodigy Behind OpenAI's ChatGPT

OpenAI's latest GPT-4o model has recently wowed both the AI community and the general public. The model is seen as a significant advance over OpenAI's previous state-of-the-art model, GPT-4. But even as independent researchers put GPT-4o through its paces on AI benchmarks, the model has been revealed to have an Indian connection.

What Does the Headline Say ~

Prafulla Dhariwal, a native of Pune, has been credited by OpenAI CEO Sam Altman with developing GPT-4o. Altman wrote on X, "Without Prafulla Dhariwal's vision, talent, conviction, and determination over a long period, GPT-4o would not have happened." "It (together with the efforts of numerous others) resulted in what I hope will prove to be a revolution in computer usage," he continued.

Representing India At A Global Spectrum ~

Let's Rollback Into His Life ~

Prafulla Dhariwal, 28, is an Indian native who has been employed at OpenAI since 2017. He was a young prodigy who, while attending school in India in 2009, took home a gold medal from the International Astronomy Olympiad. In 2011, he began attending Dr. Kalmadi Shyamrao High School in Pune. While still in school, he wrote and graded problem sets in number theory and geometry, taught algebra, functional equations, and inequalities, and prepared Pune's students for the Indian National Mathematical Olympiad.

He took the IIT-JEE in 2013, just like millions of other Indian students, and received a rank of 165. However, he enrolled at MIT in the US rather than at one of the IITs. He earned a Bachelor of Science in Computer Science from MIT and worked as an intern at D.E. Shaw and Pinterest, two of the industry's leading organizations.
While at MIT, Dhariwal worked on CNN- and RNN-based models for learning invariant representations for voice classification at the Center for Brains, Minds and Machines, and at the Computer Vision Group, which used deep learning to understand images. He began working at OpenAI as a research intern in 2016 and became a research scientist in 2017.

His Contribution Along With Many Ambitious Minds ~

On his X profile, he identifies himself as a co-creator of GPT-3, DALL-E 2, and other OpenAI products, suggesting that he has contributed significantly to some of the company's largest releases over the years. And now, Sam Altman has acknowledged him as having played a key role in the development of OpenAI's most recent model, GPT-4o.

Prafulla Dhariwal is not the only Indian researcher to have contributed significantly to the present AI revolution. The transformer architecture, which forms the basis of many of the most recent developments in artificial intelligence, was first described in the groundbreaking "Attention Is All You Need" paper, whose co-authors include two Indian researchers then at Google, Ashish Vaswani and Niki Parmar. Meanwhile, Aravind Srinivas is a co-founder and the CEO of Perplexity AI, an incredibly popular AI startup that many have described as a Google rival. Even though India is still catching up to other countries in terms of domestic AI startups, such as Sarvam AI and Krutrim, Indian-origin researchers and founders are leading the charge in the AI revolution.

Prafulla has also contributed significantly to the development of sophisticated AI models and methods, including 'Glow', which produces high-resolution images rapidly; the Variational Lossy Autoencoder, which guards against problems in autoencoders; PPO (Proximal Policy Optimisation), a reinforcement learning algorithm; and GamePad, which uses reinforcement learning to prove formal theorems.
Now that OpenAI's model can hold lifelike, real-time speech conversations, music generation looks like the next area of focus for the technology, and Dhariwal will surely be at the center of it all.

On a concluding note ~

India is recognized globally for its scientific rigor and potential. After all, this is the land of Ayurveda, the land of climate sensitivity demonstrated through the Chipko Movement back in the 1970s, the land where successful nuclear tests like Pokhran-II were conducted, and the land where science maestros C.V. Raman (Nobel Prize in Physics 1930) and Anna Mani were born. Thanks to passionate minds like Prafulla, we are taking back the reins of international representation and encouraging people to work towards the scientific and technological advancements that will help us regain our prosperity. Over the years, India has witnessed a massive shift in these fields of discovery by strategically aligning its skills and resources.

17 May 2024

The Whys and Hows

Creating Video from Text: 'Sora' is Driven With Futuristic Features!

OpenAI has presented 'Sora', an AI model that transforms text prompts into realistic videos. It can create complex scenes, interpret language, and turn still images into videos. Researchers, artists, and filmmakers have been given access to evaluate it and provide feedback.

OpenAI, the company behind ChatGPT, revealed 'Sora' on Thursday. The model creates videos after receiving directions from users on the style and content of the clip. In addition to producing videos from text prompts, OpenAI stated in a blog post that it can animate still images.

First and Foremost: Addressing the Striking Futurism in AI (Artificial Intelligence) ~

AI now serves individuals all across the world as a teacher, mentor, friend, and much more. With the capacity to "think" and an answer to practically any query, it has remarkably passed several tests designed to gauge a person's mentality and way of thinking. The revolution in AI is here!

AI has great potential for the future, bringing with it improvements in information access, education, healthcare, and transportation. It will also favor solution-oriented people and demand greater technological knowledge. However, it is impossible to dismiss ethical worries about how AI will affect society, including challenges with privacy and data protection as well as discriminatory and biased outputs. Questions also remain about who is responsible for incorrect outputs and for the human actions that follow from them.

AI will be used increasingly often in businesses to support various departments and client interactions. Growth in the healthcare sector is also anticipated. As the use of AI accelerates, regulators and rules for accountability and ethics are likely to follow, along with techniques intended to reduce bias and make the handling of data more transparent.

What can Sora do?
According to OpenAI, Sora is capable of creating complex scenes with several characters, specific types of motion, and accurate details of the subject and background. The model understands not just what the user requested in the prompt, but also how those things exist in the real world. "We're teaching AI to understand and simulate the physical world in motion, with the goal of training models that help people solve problems that require real-world interaction," a spokesperson for OpenAI stated.

The model has a thorough understanding of language, allowing it to interpret prompts accurately and create compelling characters who express vivid emotions. Sora can also use an existing still image to create a video. In addition, the model can extend an existing video or fill in its missing frames. Sora can also create several shots within a single generated video that keep the characters and visual style consistent.

Who has access to Sora thus far?

Sora is still under development. It is currently accessible to red teamers, who are assessing critical areas for harms or risks. Access has also been provided to several visual artists, designers, and filmmakers to gather feedback on how to improve the model so that it is most useful for creative professionals. OpenAI is publishing its research progress early so that it can begin collaborating with and receiving feedback from people outside the company, and to give the public an idea of what AI capabilities are on the horizon.

Is there any flaw in the new AI model?

OpenAI has said that the present Sora model has weaknesses.
It may struggle to accurately simulate the physics of a complex scene and may not understand specific instances of cause and effect. The model may also confuse spatial details of a prompt, such as left and right, and struggle with precise descriptions of events that unfold over time, such as following a specified camera trajectory.

OpenAI's Safety Initiatives ~

OpenAI stated that Sora is currently unavailable to the public because the company is taking safety precautions before incorporating it into its products. For instance, once Sora is integrated into an OpenAI product, a text classifier will check and reject text input prompts that violate the usage policies, such as those requesting extreme violence, sexual content, hateful imagery, celebrity likeness, or the intellectual property (IP) of others. Robust image classifiers have also been developed to review the frames of every generated video and ensure compliance with the usage policies before it is shown to the user.

OpenAI plans to engage legislators, educators, and artists globally to better understand their concerns and identify beneficial applications for this new technology. Despite extensive research and testing, the company acknowledged that it cannot foresee all the positive applications of its technology, nor anticipate all potential forms of misuse. It emphasized that learning from real-world use is crucial to developing and releasing increasingly safe AI systems over time.

Other Text-to-Video Models to Look Out For ~

Sora is not the first video-generating model. Last year, Meta added AI-powered functionality to its image-generating model Emu, which can edit images and make videos from text prompts. Meanwhile, earlier this year, Google unveiled Lumiere, a new AI-powered model that uses generative AI to create videos from basic text inputs.
Sora serves as a foundation for models that can understand and simulate the real world, a capability OpenAI believes will be an important milestone for achieving artificial general intelligence (AGI).

16 Feb 2024

#OpenAI
