August 8, 2024 • 5 mins
Hello, Niuralogists!
This week, our edition dives into the dynamic world of artificial intelligence, aiming to spotlight the latest breakthroughs. We're dedicated to examining the profound implications of these advancements across various domains, from workplaces and businesses to policies and personal interactions. Featured updates include OpenAI's rollout of ChatGPT's Advanced Voice Mode, Apple's decision to use Google chips instead of Nvidia's for its AI infrastructure, and more.
For a more in-depth understanding, keep on reading…
OpenAI has begun a limited rollout of its highly anticipated ‘Advanced Voice Mode’ for paying ChatGPT Plus users, offering natural, real-time conversations and the ability for the AI to detect and respond to emotions. Initially available to a small group of users, the feature will be accessible to all Plus users by fall 2024. Utilizing GPT-4o, Advanced Voice Mode can sense emotions in users' voices, including sadness, excitement, or even singing. Video and screen-sharing capabilities, previously showcased in early demos, will launch at a later date. OpenAI has already sent email instructions to the initial ‘Alpha’ group selected for early access. This marks a significant shift in AI from a tool we text or prompt to an intelligence we can collaborate with, potentially revolutionizing customer service and mental health support by understanding and responding to emotions in real-time.
A recent report revealed that Apple has chosen to use Google chips for its AI infrastructure, bypassing industry leader Nvidia. Apple will use Google chips instead of Nvidia's GPUs to power its AI-related features and tools. This strategic move aims to minimize Apple's hardware dependency, as Nvidia has already captured a significant market share in this domain. The decision was disclosed in a research paper detailing Apple's AI model training process, which utilizes Google's tensor processing units (TPUs) organized in large clusters. Specifically, Apple is deploying 2,048 TPUv5p chips and 8,192 TPUv4 processors for its server-side models. Unlike Nvidia’s independently available GPUs, Google’s TPUs are accessible only through the Google Cloud Platform. This news comes as Apple begins rolling out AI-powered features to beta users, including an enhanced Siri, email summarization, and AI-driven dictation. Despite this significant development, Apple’s stock saw only a marginal dip, reflecting a cautious investor response. This announcement follows Apple’s annual developer conference, where it showcased new AI capabilities and the integration of OpenAI’s ChatGPT technology.
Meta has introduced Segment Anything Model 2 (SAM 2), an advanced AI model designed to identify and track objects across video frames in real time, representing a significant advancement in video AI technology. SAM 2 extends Meta's previous image segmentation capabilities to video, effectively addressing challenges such as fast movement and object occlusion. This model can segment any object in a video and create cutouts with just a few clicks, with a free demo available for users. Meta is also open-sourcing the model and releasing an extensive, annotated database of 50,000 videos used for training. Potential applications of SAM 2 include video editing, mixed reality experiences, and scientific research. The model's real-time tracking capabilities could simplify complex video editing tasks, such as object removal or replacement, to a single click. This release follows Meta's recent launch of Llama 3.1, reinforcing the company's strategy of delivering substantial AI breakthroughs while making them freely accessible.
Chinese tech giant Baidu has unveiled a breakthrough in artificial intelligence that enhances the reliability and trustworthiness of language models. Researchers at Baidu have developed a novel “self-reasoning” framework enabling AI systems to critically evaluate their own knowledge and decision-making processes, addressing the issue of factual accuracy in large language models. Detailed in a paper published on arXiv, this approach improves the reliability and traceability of retrieval-augmented language models (RALMs) through a multi-step framework comprising relevance-aware, evidence-aware selective, and trajectory analysis processes. This mechanism allows AI to verify and contextualize information, moving beyond simple retrieval to critical assessment of its outputs. Baidu's model achieved performance comparable to GPT-4 with significantly fewer training samples, suggesting a path to developing highly capable AI systems with less data. This innovation represents a significant step toward creating more trustworthy AI, crucial for industries that require high reliability and transparency in decision-making processes.
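The three-stage pipeline described above can be sketched as a simple control flow. Everything below is a hypothetical illustration, not Baidu's implementation: the `toy_llm` stub just pattern-matches on prompts, and the prompt wording is invented. What it shows is the shape of the method, in which the model first filters retrieved documents for relevance, then selects citable evidence, and finally reviews that reasoning trail to produce an answer.

```python
def self_reasoning_answer(question, retrieved_docs, llm):
    """Schematic three-stage self-reasoning over retrieved documents."""
    # Stage 1: relevance-aware process -- judge each retrieved document.
    relevant = [d for d in retrieved_docs
                if llm(f"Is this relevant to '{question}'? {d}") == "yes"]

    # Stage 2: evidence-aware selective process -- cite key sentences.
    evidence = [llm(f"Quote the key sentence for '{question}': {d}")
                for d in relevant]

    # Stage 3: trajectory analysis -- review the trail, emit the answer.
    trajectory = " | ".join(evidence)
    answer = llm(f"Given the evidence [{trajectory}], answer: {question}")
    return {"answer": answer, "evidence": evidence}


def toy_llm(prompt):
    """A stand-in for a real model call, for demonstration only."""
    if prompt.startswith("Is this relevant"):
        return "yes" if "capital of France" in prompt.split("? ", 1)[-1] else "no"
    if prompt.startswith("Quote"):
        return "Paris is the capital of France."
    return "Paris"


result = self_reasoning_answer(
    "What is the capital of France?",
    ["Paris is the capital of France.", "Berlin is in Germany."],
    toy_llm,
)
```

The key design point is that the final answer is generated only from the evidence the model itself selected and justified, which is what makes the output traceable.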
Apple's highly anticipated AI features, collectively known as Apple Intelligence, will be delayed and will not be included in the initial release of iOS 18 and iPadOS 18 this September. Instead, these features are now scheduled for rollout in later updates, with a planned release by October. Developers will have early access to test Apple Intelligence features through the iOS 18.1 and iPadOS 18.1 betas starting this week. However, some advanced Siri features will also be absent from the October update, with a full rollout expected to extend into 2025. This news coincides with Apple's recent signing of a White House commitment to developing safe, secure, and trustworthy AI. While the delay may disappoint fans, it reflects Apple's commitment to delivering stable and polished features. The initial demos presented at WWDC in June now appear more like teasers, given the extended timeline for the full release.
MIT startup Striv, which participated in the START.nano accelerator program, has developed an advanced shoe sole designed to enhance athletic performance by tracking force, movement, and form. This innovative technology, tested by Olympic and professional athletes, integrates tactile sensors with algorithms to provide precise performance insights. Striv’s founder, Axl Chen, aims to use the 2024 Olympics as a testing ground for the company's technology, which will eventually be available to the general public. The startup, which has received significant support from MIT’s resources, plans to expand its technology to various sports and everyday runners.
MIT researchers have introduced a novel calibration method called Thermometer to address the challenge of large language models (LLMs) being overconfident in their wrong predictions. Traditional calibration techniques often fall short for LLMs due to their broad application across diverse tasks. The Thermometer method, developed by MIT and the MIT-IBM Watson AI Lab, employs a smaller auxiliary model to refine the confidence of LLMs without extensive computational costs. This technique improves the model’s response accuracy and efficiency by predicting the optimal "temperature" for calibration, which aligns confidence levels with actual accuracy. Unlike traditional methods that require task-specific datasets, Thermometer generalizes across tasks, making it adaptable to new applications with minimal additional data. This innovation aims to enhance the reliability of LLMs, providing clearer indicators of when users can trust their predictions.
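The "temperature" that Thermometer predicts is just a scalar divisor applied to a model's output logits before the softmax: values above 1 soften the distribution and reduce overconfidence, while values below 1 sharpen it. A minimal, generic sketch of this scaling step follows; the logit values and the temperature of 2.5 are made up for illustration and are not taken from the MIT work.

```python
import math

def softmax(logits, temperature=1.0):
    """Scale logits by 1/temperature, then normalize to probabilities.

    temperature > 1 softens the distribution (lower confidence);
    temperature < 1 sharpens it (higher confidence).
    """
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for a 3-way multiple-choice answer.
logits = [4.0, 1.0, 0.5]

raw = softmax(logits)              # uncalibrated: top probability ~0.93
calibrated = softmax(logits, 2.5)  # softened: top probability ~0.65
```

Note that temperature scaling never changes which answer ranks first; it only adjusts how confident the reported probability is, which is why it can improve calibration without hurting accuracy.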
🎥 PixVerse V2 is a multi-frame video generation tool with consistent style and subjects
🎬 Nolan is an AI script-writing software for filmmakers
🚀 Comfy Deploy turns ComfyUI workflows into production-ready APIs
🎓 Llama Tutor helps you learn faster with an AI tutor powered by Llama 3.1
🔒 Gandalf tests your prompting skills and helps you gain insights into securing AI