AI Pulse Edition #5: Latest AI News Updates for the Developer Community

AI Pulse – Edition #5

No rest for the wicked.

We’re back with another packed edition of AI Pulse, and if there’s one thing we’ve learned, it’s that the AI revolution shows no signs of slowing down!

OpenAI is stirring the pot with the launch of its new gpt-4o-audio-preview model, the roll-out of a ChatGPT Windows desktop app, and the expansion of Advanced Voice capabilities across Europe. A wave of new models from Mistral, NVIDIA, Stability AI, and Meta prioritizes efficiency and customization, while Anthropic’s new Computer Use capability unlocks new automation use cases.

On the energy front, the focus on nuclear power continues to intensify: Microsoft’s plans to revive the Three Mile Island plant are taking further shape, Amazon is funding small modular reactors, and Google is striking new nuclear deals.

Legal boundaries are also shifting, with new copyright restrictions and U.S. action on AI-generated exploitation imagery. In research, OpenAI’s efficiency-focused models and BitNet’s sustainable approach showcase a push to cut the environmental cost of AI.

Creativity meets controversy as Adobe invests $100M in AI literacy while prominent creators push back against unlicensed AI training on their work.

And that’s just scratching the surface. Ready to dive in :open_book:?

Your AI Pulse Team: @jr.2509 @PaulBellow @vb @platypus @dignity_for_all @trenton.dambrowitz


Table of contents

1. Technology Updates
2. Infrastructure
3. Government & Policy
4. Legal Matters
5. AI Economics
6. Research
7. Entertainment
8. Dev Alerts
9. Community Spotlight


1. Technology Updates

A month into Q4, AI-related technology advancements continue at a rapid pace. OpenAI has debuted a Windows desktop app for ChatGPT and expanded access to Advanced Voice across Europe. A host of new models from Mistral, NVIDIA, Stability AI, Meta, and Anthropic are pushing boundaries in efficiency and real-world applicability, tailoring AI to diverse needs from edge devices to advanced coding tasks. Anthropic’s latest feature enables AI to interact with computers in a human-like manner, opening new horizons for automation. Hugging Face simplifies AI deployment with zero-configuration microservices, while NVIDIA and Meta unveil architectural advancements to boost performance. Meanwhile, Perplexity AI and Apple’s new iPad Mini integrate cutting-edge AI features to enrich user engagement.

OpenAI Launches ChatGPT Windows Desktop App and Introduces Chat Search

OpenAI has launched an early version of a ChatGPT desktop app, enhancing interaction with files and photos on Windows. Initially available exclusively to Plus, Team, Enterprise, and Edu users, a full release is expected later this year. Concurrently, OpenAI is rolling out the ability to search through chat history on the ChatGPT web app and has expanded access to ChatGPT Advanced Voice to all Plus users in the EU, Switzerland, Iceland, Norway, and Liechtenstein.

Source: OpenAI

New Models Prioritize Efficiency, Customization, and Real-World Applicability

The model landscape continues to expand with a series of new releases:

- Mistral introduced “les Ministraux”—Ministral 3B and 8B—optimized for on-device and edge use with up to 128k context length, excelling in local tasks like translation, offline smart assistants, and robotics.
- NVIDIA’s Llama-3.1-Nemotron-70B-Instruct, fine-tuned with Reinforcement Learning from Human Feedback (RLHF) to improve response alignment and helpfulness, is now accessible via the Hugging Face Transformers codebase.
- Stability AI launched Stable Diffusion 3.5 in three variants: the 8-billion-parameter Large model, offering high-quality images with top-tier prompt adherence; Large Turbo, a distilled version for faster inference with similar output quality; and Medium, a 2.5-billion-parameter model featuring MMDiT-X architecture, optimized to run smoothly on consumer hardware.
- Meta’s FAIR team released new models and tools designed to advance machine intelligence and open science, including SAM 2.1, an enhanced Segment Anything Model with improved occlusion handling, and Meta Spirit LM, its first open-source multimodal model blending text and speech for cross-modal tasks like ASR and TTS.
- Anthropic upgraded Claude 3.5 Sonnet and released Claude 3.5 Haiku, both optimized for coding and software engineering tasks with improved industry benchmark performance, while maintaining speed and cost efficiency.

Source 1: Mistral, Source 2: NVIDIA, Source 3: Anthropic, Source 4: Meta, Source 5: Stability AI

Anthropic Launches New ‘Computer Use’ Capability

Anthropic has expanded the developer toolkit with its new computer use capability, allowing Claude 3.5 Sonnet to interact with computers in a human-like manner. Now available in public beta through the Anthropic API, the new feature enables Claude to visually interpret screens, move cursors, click buttons, and type, transforming how developers can automate tasks. While still experimental, computer use is already being piloted by early adopters including Replit, who use it for automating multi-step UI-based workflows. By converting developer instructions into direct computer commands, Claude can perform tasks such as navigating web pages, filling out forms, and processing data from files.
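To make the mechanics concrete, here is a minimal sketch of what a Computer Use request body looks like. The tool type string and display parameters below follow Anthropic’s public beta documentation at launch and should be verified against the current API reference; the model ID and screen dimensions are illustrative.

```python
import json

# Sketch of a Computer Use request for the Anthropic Messages API
# (sent with the beta header "computer-use-2024-10-22").
request_body = {
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 1024,
    "tools": [
        {
            "type": "computer_20241022",  # virtual screen/mouse/keyboard tool
            "name": "computer",
            "display_width_px": 1280,
            "display_height_px": 800,
        }
    ],
    "messages": [
        {"role": "user",
         "content": "Open the form on screen and fill in the name field."}
    ],
}

# A client loop would POST this body, execute the tool_use actions Claude
# returns (screenshot, mouse move, click, type, ...) on a local machine,
# and feed the results back as tool_result messages.
payload = json.dumps(request_body)
```

The key point is that Claude never touches the machine directly: the developer’s harness executes each returned action and reports the outcome, which is what makes the capability auditable while still experimental.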

Source: Anthropic

Google Introduces SynthID for Watermarking AI-Generated Content

Google has released SynthID, a new suite of tools that embeds imperceptible digital watermarks directly into AI-generated images, audio, text, or video. SynthID helps identify AI-generated content, promoting trust and transparency. The watermarking technique is imperceptible to humans but detectable for identification, even after modifications such as cropping, adding filters, or compression. SynthID is integrated into various Google platforms, including Vertex AI’s text-to-image models (Imagen 3 and Imagen 2), the ImageFX tool, and the Veo video generation model. Google also open-sourced the text watermarking technology through the Google Responsible Generative AI Toolkit.

Source: Google

Hugging Face Releases Zero-Configuration AI Microservices

Hugging Face has launched Generative AI Services (HUGS), which are optimized, zero-configuration inference microservices designed to streamline and expedite the development of AI applications using open models. HUGS leverages open-source technologies like Text Generation Inference and Transformers, and supports a variety of hardware accelerators (including NVIDIA and AMD GPUs). These microservices are intended to ease the transition from closed-source to self-hosted open models by providing endpoints compatible with the OpenAI API, thereby maintaining hardware efficiency and facilitating easy updates as new open models become available.
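Because HUGS exposes OpenAI-compatible endpoints, switching an existing OpenAI-style client to a self-hosted deployment is mostly a matter of changing the base URL. The sketch below builds such a request with only the standard library; the localhost URL and model name are placeholders for your own deployment, not values from Hugging Face’s docs.

```python
import json
from urllib import request

BASE_URL = "http://localhost:8080/v1"  # hypothetical local HUGS endpoint

def build_chat_request(prompt: str) -> request.Request:
    """Build an OpenAI-style /chat/completions request for a self-hosted
    HUGS service. The body schema mirrors OpenAI's Chat Completions API."""
    body = json.dumps({
        "model": "tgi",  # placeholder; HUGS serves the configured open model
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("Summarize HUGS in one sentence.")
# urllib.request.urlopen(req) would return an OpenAI-style JSON response.
```

This endpoint compatibility is what eases the migration from closed-source APIs: application code written against the OpenAI schema keeps working against the self-hosted open model.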

Source: Hugging Face

NVIDIA and Meta Introduce Advancements in AI Architectures

NVIDIA and Meta have revealed several new advancements in AI architectures to boost model performance and efficiency. NVIDIA’s Normalized Transformer (nGPT) integrates normalization into the model structure, mapping all vectors onto a hypersphere, which improves stability and speeds up training by 4 to 20 times without sacrificing performance. This approach aims to enhance Transformer training efficiency and model generalization. Meta released Layer Skip, which accelerates large language model (LLM) generation by up to 1.7x through selective layer execution, SALSA for assessing post-quantum cryptography standards, the Meta Open Materials 2024 dataset for AI-driven materials discovery, and MEXMA, a cross-lingual sentence encoder for multilingual applications.
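The core of nGPT’s hypersphere constraint can be sketched in a few lines: every embedding and weight vector is rescaled to unit L2 norm after each update, so only its direction carries information and optimization effectively proceeds on the sphere’s surface. This is an illustrative simplification of the paper’s idea, not NVIDIA’s implementation.

```python
import math

def to_unit_sphere(v, eps=1e-12):
    """Project a vector onto the unit hypersphere (L2 norm = 1)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / (norm + eps) for x in v]

# After every parameter update, nGPT-style training re-normalizes, keeping
# every vector's scale fixed at 1 while its direction is learned.
v = to_unit_sphere([3.0, 4.0])  # -> approximately [0.6, 0.8]
```

Keeping all vectors on the sphere removes the need for separate normalization layers and, per the paper, is what drives the reported training speed-ups.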

Source 1: NVIDIA, Source 2: Meta

Perplexity AI Announces New Features to Enhance Research Capabilities

Perplexity AI has launched Internal Knowledge Search and Perplexity Spaces to enhance how users access and utilize information with AI-powered tools. Internal Knowledge Search allows users to search across both public web content and internal knowledge bases, including uploaded files, to access and synthesize information faster and more efficiently. Spaces provides AI-powered collaboration hubs where teams can set up customizable Spaces, invite collaborators, connect internal files, and customize the AI assistant by choosing preferred AI models and setting specific instructions.

Source: Perplexity AI

Apple Powers Up New iPad Mini with Advanced AI Capabilities

Apple has unveiled the latest iPad mini, featuring the potent A17 Pro chip and cutting-edge Apple Intelligence, tailored for enhanced AI-driven personal experiences while maintaining user privacy. This device brings functional improvements, such as a 2x faster Neural Engine, supporting a range of AI features including the new Apple Intelligence Writing Tools, which refine user writing across various apps. Moreover, AI facilitates a superior camera experience with machine learning that detects and scans documents, and integrated ChatGPT capabilities that offer advanced text and image understanding through Siri.

Source: Apple

Worldcoin Streamlines Iris-Scanning with AI-Powered Orbs

Worldcoin, co-founded by OpenAI CEO Sam Altman, has rebranded as World and introduced a simplified version of its eyeball-scanning Orb device, reflecting an effort to authenticate human identities amidst AI advancements. The updated Orb, incorporating NVIDIA’s Jetson AI and robotics platform, is designed to be more economical and accessible, facilitating the process by which users receive a World ID to authenticate online identity and secure WLD cryptocurrency tokens. Rich Heley, Chief Device Officer at Tools for Humanity, emphasized the need for significantly more Orbs to enhance global accessibility. Despite privacy concerns and resistance in countries like Hong Kong and Portugal, World reports having verified almost 7 million distinct individuals globally who engage with its system.

Source: The Verge

Morgan Stanley and SWIFT Expand AI Use for Research and Payments

Morgan Stanley and Swift are expanding their AI capabilities to enhance efficiency and security within the financial sector. Morgan Stanley has launched AskResearchGPT, a generative AI tool powered by OpenAI, to streamline access to its research reports, tripling query volumes compared to previous AI tools and reducing reliance on traditional communication methods. Meanwhile, Swift is set to introduce an AI-powered anomaly detection service in January 2025 to bolster fraud prevention in international payments. This tool, built on Swift’s Payment Controls Service, will leverage pseudonymised transaction data across its vast network of over 11,500 financial institutions, enabling real-time identification of suspicious activity. Both initiatives reflect a broader industry trend toward adopting AI solutions for enhanced productivity and security.

Source 1: Morgan Stanley, Source 2: SWIFT

2. Infrastructure

With AI driving energy demands higher, tech giants are pursuing different nuclear technologies. Amazon and Google are deploying small modular reactors (SMRs), while Microsoft is undertaking extensive upgrades to revive the dormant Three Mile Island plant. These efforts highlight Big Tech’s strategy to secure large-scale, reliable energy despite regulatory hurdles and activist challenges. Conversely, Wolfspeed’s decision to halt its semiconductor factory plans in Germany underscores ongoing challenges in Europe’s chip manufacturing ambitions.

Wolfspeed Halts Chip Factory Plan Impacting AI Tech Adoption

Wolfspeed has decided to put on hold its plans to establish a semiconductor factory in Germany, a move influenced by the slower adoption rate of electric vehicles, which significantly contributes to the demand for silicon carbide chips. These chips are notable for their application not only in electric vehicles but also in industrial and energy sectors, including emerging AI technologies that rely on advanced semiconductor components. The halt in production reflects broader challenges within the European Union to amplify its semiconductor manufacturing capability, an essential factor for enhancing AI development and reducing dependency on Asian tech firms. The project’s suspension also suggests a re-evaluation of Germany’s attractiveness as a hub for high-tech investments, marking a potential setback for AI technology growth within the region.

Source: Reuters

Microsoft Powers AI Expansion with Nuclear Energy Revival at Three Mile Island

Microsoft has signed a 20-year power agreement with Constellation Energy to revive the dormant Three Mile Island nuclear plant in Pennsylvania, aiming to power its data centers with nearly 835 megawatts of electricity. This move is part of Microsoft’s strategy to secure large amounts of carbon-free electricity to support AI technologies as the company undergoes significant expansion. Constellation’s ambitious plan will involve restoring the cooling towers and reactors at an estimated cost of $1.6 billion, including installation of a new transformer and refurbishment using modern materials. The renewed interest in nuclear energy is driven by AI’s growing power needs and Microsoft’s environmental commitments, despite regulatory, safety, and environmental challenges anticipated from local activists.

Source: Reuters

Previous Developments

Amazon Embraces Nuclear Energy to Power AI Initiatives

Amazon has signed agreements to develop small modular reactors (SMRs) to meet the rising electricity demand from data centers driven by AI advancements. Partnering with X-Energy, Amazon plans to fund a feasibility study for an SMR project in Washington state, with the potential to purchase electricity from four modules. Amazon aims to bring over 5 gigawatts of SMR capacity online in the U.S. by 2039, marking a significant step in commercial SMR deployment. Despite the promise of greenhouse gas-free energy, challenges remain, such as high costs and the management of nuclear waste.

Source: Reuters

Google's Nuclear Agreement to Power AI Innovations

Google has entered into an agreement with Kairos Power to purchase nuclear energy from small modular reactors (SMRs), aiming to bring the first reactor online by 2030. This initiative seeks to meet the growing energy demands of AI technologies with carbon-free power, potentially adding up to 500 MW to U.S. electricity grids. Google frames this partnership as a step toward environmentally sustainable AI development, though the success of SMRs remains uncertain, and some experts are concerned about Big Tech’s growing influence over energy resources.

Source: Ars Technica

3. Government & Policy

AI remains a central theme for governments and the public sector. The G7’s introduction of a comprehensive toolkit designed to translate ethical AI principles into actionable public policies and the UAE’s adoption of an AI-powered digital system that streamlines legal procedures highlight the shift toward responsibly integrating AI within the public sector. Concurrently, financial regulators continue the balancing act of promoting AI adoption while implementing risk safeguards. The UK Financial Conduct Authority has launched a specialized AI Lab, following a similar initiative by the Hong Kong Monetary Authority earlier this year. Meanwhile, Hong Kong’s government has introduced a fresh AI policy statement underscoring its commitment to responsible AI innovation. In the United States, the New York Department of Financial Services has issued new AI-focused cybersecurity guidelines, and in Australia the securities regulator has voiced concerns over a “governance gap” in AI adoption among financial licensees. Lastly, a new study by the European Audiovisual Observatory raises concerns about AI’s impact on creativity, employment, and intellectual property within the audiovisual sector.

G7 Releases New Toolkit to Navigate AI in the Public Sector

The G7 has released a new Toolkit for the application of AI in the public sector. The Toolkit is aimed at aiding policymakers and leaders in transforming AI principles into practical policies. It provides ethical guidelines, shares exemplary AI practices in the public sector, and outlines key challenges and policy strategies to optimize AI deployment and coordination across G7 member nations. Moreover, it includes case studies illustrating the benefits and hurdles of public sector AI applications, offering insights into the developmental journey of AI solutions within governments.

Source: OECD

ASIC Warns of Governance Gap in AI Adoption by Financial Licensees

The Australian Securities and Investments Commission (ASIC) has cautioned financial services and credit licensees to ensure their governance practices keep pace with the accelerating adoption of artificial intelligence (AI). In its first market review examining AI use among 23 licensees, ASIC found potential for governance to lag behind AI implementation, even though current AI usage remains relatively cautious and focuses on supporting human decisions and improving efficiencies. With around 60% of licensees planning to ramp up AI usage, ASIC Chair Joe Longo emphasized the necessity of updating governance frameworks to address future challenges posed by the technology. The review revealed that nearly half of the licensees lacked policies considering consumer fairness or bias, and even fewer had guidelines on disclosing AI use to consumers.

Source: ASIC

HK Government Issues Policy Statement on Responsible AI in Financial Market

The Hong Kong Government has released a policy statement detailing its approach to the responsible application of artificial intelligence (AI) in the financial market. The policy outlines a dual-track approach to promote AI adoption in the financial sector while addressing potential challenges such as cybersecurity, data privacy, and intellectual property rights. Key initiatives include encouraging financial institutions to develop AI governance strategies with human oversight, providing access to AI resources through the Hong Kong University of Science and Technology, and continuous updates of regulations by financial regulators to keep pace with AI developments.

Source: Government of the Hong Kong Special Administrative Region

UAE Public Prosecution Introduces AI-Powered Digital System

The Federal Public Prosecution in the UAE announced the development of a new AI-based digital system to streamline and enhance legal procedures. Created with AI71, a company specializing in AI solutions for governmental and private sectors, the system offers precise legal research, comprehensive fact analysis, and quick access to legal precedents. It is expected to accelerate the handling of criminal cases and increase transparency, supporting more efficient judicial decision-making.

Source: Emirates News Agency

UK Financial Conduct Authority to Foster Responsible AI Adoption

The UK Financial Conduct Authority (FCA) has introduced an AI Lab as part of its ongoing mission to foster innovation in financial services. This initiative aims to assist firms in overcoming the challenges of developing and implementing AI solutions, while also supporting the government’s agenda on safe and responsible AI advancement. The AI Lab will consist of several components including AI Spotlight for showcasing AI applications, an AI Sprint for collaborative policy development, an AI Input Zone for stakeholder feedback, and an enhanced Supercharged Sandbox for AI testing.

Source: UK Financial Conduct Authority

NY DFS Releases New AI Cybersecurity Guidance

The New York State Department of Financial Services (DFS) has issued new guidance to help regulated entities address cybersecurity risks associated with AI. The guidance reflects concerns that AI could introduce new vulnerabilities, including AI-enabled social engineering that uses deepfakes in audio, video, and text to manipulate employees into unauthorized actions like fraudulent transfers, and AI-enhanced cyberattacks that swiftly identify system weaknesses and create new malware. The guidance is designed to help financial institutions identify and mitigate such AI-specific risks, and it encourages entities to proactively evolve their cybersecurity programs in response, for example through regular risk assessments that account for AI-related threats and strong third-party vendor management with appropriate contractual protections and access controls.

Source: NY Department of Financial Services

AI Impact Analysis on Europe's Audiovisual Industry Reveals Concerns

The European Audiovisual Observatory, a Council of Europe entity, recently released a comprehensive report analyzing AI’s impact on Europe’s cinema, TV, and streaming sectors. The report highlights significant risks, including job displacement, decreased human creativity, threats to copyright and personality rights, and the increased potential for misinformation and disinformation. It examines the readiness of current EU legal frameworks across 27 member states to address these challenges, assessing if existing regulations are robust enough to manage the rapid growth of AI within the audiovisual industry, as well as reviews the broader ethical implications of AI for the industry.

Source: Council of Europe

4. Legal Matters

As AI accelerates its reach into various facets of society, recent developments illustrate the intricate dance between AI innovation and the legal system’s efforts to keep pace and the profound reconsiderations the technology is prompting. The U.S. Justice Department is opening a new chapter in law enforcement by targeting AI-generated child exploitation imagery, highlighting the dark potentials of the technology. At the same time, Penguin Random House is asserting its rights by restricting AI from training on its books, defending intellectual property in the digital age. OpenAI is bolstering its navigation of regulatory complexities with the appointment of a new Chief Compliance Officer. Meanwhile, Character.AI faces legal challenges following a tragic incident involving a teenager’s interaction with a chatbot, bringing into light the personal risks associated with AI companionship.

US Justice Department Tackles AI-Generated Child Exploitation

The U.S. Justice Department is increasing its focus on cases involving AI-generated child sexual abuse imagery, which law enforcement views as a growing threat. Prosecutors and child protection advocates worry that AI technology, which can easily create or modify images of children, risks normalizing abusive material and could impede efforts to identify and protect real victims. The National Center for Missing and Exploited Children reports a steady increase in AI-related child exploitation tips, though AI cases remain a fraction of all reports. In response, advocacy groups have secured commitments from major AI firms to monitor their platforms for misuse and prevent their models from generating exploitative content. As these prosecutions advance, they may set important legal precedents in defining AI’s role in child exploitation cases.

Source: Reuters

Penguin Random House Sets Copyright Restrictions Against AI Training

According to a report by The Bookseller, Penguin Random House has introduced a new clause in the copyright page of its books, both new and reprinted, explicitly prohibiting their use for AI training purposes. This move marks the publisher as one of the first major publishing houses to consider AI implications directly in their copyright terms, comparable to a website’s robots.txt file in its function as a disclaimer. While this clause aims to prevent AI technologies from mining their content under the EU’s text and data mining exception, it acts more as a procedural safeguard than an enforceable legal measure. The publisher emphasizes its dedication to protecting the intellectual property of its authors, even as other publishers like Wiley and Oxford University Press pursue AI training partnerships.

Source: The Verge

OpenAI Appoints New Compliance Officer to Navigate AI Regulations

OpenAI has appointed Scott Schools as its new Chief Compliance Officer to strengthen its efforts to advance AI responsibly. With significant experience in legal and compliance roles, including at the U.S. Department of Justice and Uber Technologies, Schools is expected to work closely with OpenAI’s Board of Directors and various teams to navigate the rapidly evolving regulatory landscape, ensuring adherence to high standards of integrity and ethical conduct.

Source: OpenAI

Tragic Youth Incident Raises Questions Over AI Companionships

Character.AI is facing a lawsuit following the suicide of a 14-year-old boy in Florida, who reportedly became emotionally attached to a chatbot on the platform. The boy regularly interacted with a specific bot called “Dany” through Character.AI’s role-playing app, which led to his withdrawal from the real world. In response to the incident, Character.AI has announced new safety measures, including advanced detection systems to identify chats breaking its terms of service and notifications for prolonged user interaction with the app. This incident highlights the growing popularity and complex mental health implications of AI companionship apps, an area still lacking comprehensive research.

Source: TechCrunch


5. AI Economics

Former OpenAI leadership continues to spin off new AI ventures. Following Ilya Sutskever’s $1 billion raise for SSI, former CTO Mira Murati is now looking to secure significant funds for her new AI startup. In the meantime, OpenAI is deepening its commitment to understanding AI’s broader economic implications by appointing Dr. Aaron “Ronnie” Chatterji (former deputy director of the National Economic Council) as its first Chief Economist.

Mira Murati Steps into AI Startup Arena

Former CTO of OpenAI, Mira Murati, is seeking venture capital funding for her new AI startup, which aims to develop AI products based on proprietary models. While it’s not confirmed if she will become CEO, the startup is likely to raise over $100 million, driven by Murati’s esteemed reputation in AI. Barret Zoph, another ex-OpenAI executive, may also join this venture. Murati, known for leading projects like ChatGPT and DALL-E at OpenAI, left the company amid its governance changes to pursue personal exploration.

Source: Reuters

OpenAI Hires First Chief Economist to Explore AI’s Economic Impact

OpenAI has appointed Dr. Aaron “Ronnie” Chatterji as its first Chief Economist, tasked with researching the economic implications of AI to ensure its benefits are widely shared. Dr. Chatterji, a Duke University professor with extensive experience in economic policy, will explore how AI can influence economic growth and job markets, and offer insights to guide policymakers and organizations in maximizing AI’s potential as an economic catalyst. His work will also support OpenAI’s developer community and business partners in leveraging AI for growth and competitive advantage.

Source: OpenAI

6. Research

Recent advancements in artificial intelligence highlight a dual trajectory: enhancing model efficiency while addressing ethical challenges. Innovations like OpenAI’s continuous-time consistency models (sCMs) and the BitNet architecture are significantly improving the efficiency of generative models and large language models, delivering high-quality outputs with reduced computational demands. Simultaneously, studies provide further evidence that AI models can exhibit biases based on user names, perpetuating harmful stereotypes, and that AI integration in scientific research, while boosting citation rates, may exacerbate existing inequalities.

Addressing Bias in ChatGPT Responses Based on User Names

Researchers investigated how ChatGPT’s responses are influenced by users’ names, revealing how subtle identity cues, such as names, can lead to variations in response. This study focuses on “first-person fairness,” examining biases that directly affect users rather than those applied by institutions through AI. Names carry cultural, gender, and racial implications, which makes them pertinent for exploring potential biases, as they are shared during interactions like drafting emails. The research aims to ensure that while ChatGPT personalizes responses, it does so without perpetuating harmful stereotypes or biases, which are challenges associated with language models that can inherit societal biases from their training data.

Source: OpenAI

TxT360: The Cutting-Edge LLM Pre-Training Dataset

The newly introduced TxT360 dataset, created by LLM360, is designed to transform large language model (LLM) pre-training with its meticulous deduplication of data from 99 CommonCrawl snapshots and 14 diverse, high-quality sources, including FreeLaw and PG-19. This dataset, spanning more than 15 trillion tokens, integrates high-quality curated data with extensive web-scraped content to surpass previous benchmarks set by datasets like FineWeb 15T. The sophisticated processing pipeline enables precise data weighting and distribution for improved model training, avoiding further filtering needs typical in other pre-training datasets. TxT360 highlights the importance of high-quality data and strategic source blending, offering detailed documentation to support future LLM training endeavors while addressing the unique preprocessing requirements of web and curated datasets.
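The deduplication step at the heart of such pipelines can be illustrated with a minimal exact-hash pass. TxT360’s actual pipeline operates globally across 99 snapshots and also performs near-duplicate (fuzzy) matching; this sketch shows only the simplest exact-match case.

```python
import hashlib

def dedup(docs):
    """Keep the first occurrence of each document, keyed by content hash.
    Hashing avoids holding full documents in the 'seen' set, which matters
    when deduplicating at web-corpus scale."""
    seen, unique = set(), []
    for doc in docs:
        h = hashlib.sha256(doc.encode("utf-8")).hexdigest()
        if h not in seen:
            seen.add(h)
            unique.append(doc)
    return unique

docs = ["the cat sat", "a dog ran", "the cat sat"]
unique = dedup(docs)  # duplicates collapse to the first occurrence
```

At trillion-token scale the same idea is sharded across machines and extended with MinHash-style signatures to catch near-duplicates, which exact hashing alone would miss.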

Source: LLM360

AI Integration Boosts Scientific Citations but Exacerbates Inequality

An analysis of nearly 75 million scientific papers reveals that those mentioning AI methods are more likely to rank in the top 5% of most-cited works within their field, especially when AI terms are included in abstracts or titles. Despite this ‘citation boost’, the benefit is not experienced equally, with underrepresented groups in science not receiving the same citation advantages as their counterparts, suggesting AI may worsen existing disparities. While enhancing the understanding of AI’s role in science, this study also highlights discipline variability, with fields like computer science using AI extensively compared to others like history and art. Concerns arise that researchers might turn to AI solely for citation benefits, potentially impacting the quality and diversity of scientific approaches.

Source: Nature

AI Aids in Mediation for Opposing Views

A study by Google DeepMind has demonstrated the use of a chatbot powered by AI to help groups with differing opinions find common ground. The tool, leveraging a fine-tuned version of the DeepMind LLM Chinchilla, synthesizes opinions and produces comprehensive summaries that integrate multiple perspectives, often rated clearer and fairer than those crafted by human mediators. The experiment involved 439 participants in the UK, where the AI-generated summaries received a majority preference over human-written ones, suggesting its utility in enhancing citizen assemblies and deliberative polls. The research indicates that AI can play a significant role in democratic deliberations by providing inclusive summaries that can assist in formulating balanced policy proposals, though the human connection element remains a concern.

Source: Nature

AI-Driven Protein Design: From Concept to Competition

AI is revolutionizing protein design, with competitions emerging to evaluate the functional viability of AI-designed proteins. The recent rise in such contests has been fueled by AI tools like AlphaFold and protein language models, which have become popular for generating novel proteins that could serve as drugs or industrial enzymes. Competitions are advancing the field by democratizing access, accelerating validation, and standardizing development processes. However, experts caution that these competitions must carefully select problems and judging criteria to avoid misleading outcomes, though current initiatives appear to foster collaborative sharing of methodologies and results.

Source: Nature

Breaking Ground: Innovative Consistency Models Revolutionize AI Sampling

OpenAI has introduced continuous-time consistency models (sCMs), a cutting-edge approach that promises major advancements in generative AI by dramatically improving sampling efficiency. These models achieve sampling quality comparable to top benchmark diffusion models using only two sampling steps, overcoming the traditionally slow sampling of diffusion models, which typically require many more steps. The sCMs are built on an advanced theoretical framework called TrigFlow, which simplifies existing model formulations and identifies the root causes of training instability. This innovation allows for the removal of discretization errors and hyperparameters, offering a more robust training process. The enhanced design enables these models to scale effectively to 1.5 billion parameters, delivering targeted improvements in image generation on datasets like CIFAR-10 and ImageNet with significant reductions in the computational load needed for high-quality outputs.
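To make the "two sampling steps" idea concrete, here is a toy sketch of generic two-step consistency sampling: one model call maps pure noise directly to a sample, the result is re-noised to an intermediate level, and the model is applied once more. The consistency function below is a hypothetical stand-in (a real sCM is a trained neural network), and the noise levels are assumed values for illustration, not OpenAI's settings.

```python
import random

T_MAX = 80.0    # initial noise level (assumed for illustration)
T_MID = 0.8     # intermediate re-noising level (assumed for illustration)

def consistency_fn(x, t):
    """Hypothetical stand-in for a trained consistency model f(x, t):
    maps a noisy sample at noise level t directly to a clean estimate.
    Here we simply shrink the input toward zero as a placeholder."""
    return [xi / (1.0 + t) for xi in x]

def two_step_sample(dim, rng):
    # Step 1: a single model evaluation from pure noise at t = T_MAX.
    z = [rng.gauss(0.0, T_MAX) for _ in range(dim)]
    x0 = consistency_fn(z, T_MAX)
    # Step 2: re-noise to an intermediate level, then evaluate once more.
    z2 = [xi + rng.gauss(0.0, T_MID) for xi in x0]
    return consistency_fn(z2, T_MID)

rng = random.Random(0)
sample = two_step_sample(4, rng)
```

The key property is that, unlike a diffusion sampler's long chain of denoising steps, the total cost here is exactly two forward passes regardless of dimensionality.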

Source: OpenAI

BitNet: A New Efficient Approach for Scaling Language Models

BitNet, a novel 1-bit Transformer architecture introduced by Microsoft researchers, aims to address the challenges posed by the increasing size of large language models, particularly their environmental impact due to high energy consumption. By utilizing BitLinear as a drop-in replacement for the nn.Linear layer, BitNet allows 1-bit weights to be trained from scratch, which significantly reduces memory usage and energy consumption. Experimental evaluations indicate that BitNet not only competes with existing quantization methods but surpasses them in efficiency, achieving results comparable to full-precision Transformers while significantly reducing resource consumption. This architecture provides a promising pathway for scaling large language models more sustainably without compromising performance.
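A simplified sketch of the 1-bit idea behind BitLinear: weights are quantized to {-1, +1} via their sign and rescaled by a single per-tensor scale (the mean absolute weight), so the inner product reduces to additions and subtractions plus one multiply. This is a pure-Python toy with the paper's normalization details omitted; function names are illustrative, not from the BitNet codebase.

```python
def binarize(weights):
    """Quantize a weight vector to {-1, +1} with a per-tensor scale:
    alpha = mean(|W|), W_b = sign(W). (Simplified from BitNet's BitLinear.)"""
    alpha = sum(abs(w) for w in weights) / len(weights)
    w_bin = [1 if w >= 0 else -1 for w in weights]
    return w_bin, alpha

def bitlinear(x, weights):
    """1-bit dot product: the accumulation uses only adds/subtracts,
    with a single multiplication by the scale alpha at the end."""
    w_bin, alpha = binarize(weights)
    acc = sum(xi if wb > 0 else -xi for xi, wb in zip(x, w_bin))
    return alpha * acc

# Example: alpha = (0.3 + 0.2 + 0.1) / 3 = 0.2, signs = [+1, -1, +1]
y = bitlinear([0.5, -1.0, 2.0], [0.3, -0.2, 0.1])
```

Because each weight needs one bit instead of 16 or 32, memory traffic drops dramatically, which is where most of the claimed energy savings come from.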

Source: Microsoft

7. Arts & Entertainment

AI’s expanding role in media is sparking innovation and moral debates. Adobe’s Generative Extend in Premiere Pro allows editors to expand video and audio clips, while prominent creators like Björn Ulvaeus and Julianne Moore oppose unauthorized use of their work in AI training, pushing for ethical standards. Adobe is investing $100 million in global AI literacy through Coursera and others. Meanwhile, the Lenfest Institute, with OpenAI and Microsoft, launched an AI fellowship to support local journalism.

Adobe Boosts AI Literacy with Global Skilling Initiative

Adobe has launched a new global skilling initiative focused on enhancing AI literacy, aiming to equip 30 million people with skills in AI content creation and digital marketing. The initiative involves collaborations with Coursera, as well as various educational and alternative learning organizations, to offer comprehensive courses such as Generative AI Content Creation and Creative & AI Skills. Adobe has committed over $100 million this year to support this program and is offering $250,000 in scholarship licenses to improve access to its courses. The courses, available on Coursera, cater to a global audience, providing flexible learning paths and certificates such as the Content Creator and Graphic Designer Professional Certificates.

Source: Adobe

Adobe Debuts AI-Driven Generative Extend in Premiere Pro Beta

Adobe has introduced Generative Extend, a new AI-powered feature in Premiere Pro (beta), that allows video editors to seamlessly extend video or audio clips while maintaining lifelike quality. Powered by Adobe’s Firefly Video model, this tool helps fill gaps in footage, fix awkward cuts, and adds room tone to extend audio. Editors can easily integrate this functionality into their workflow without using original user content, ensuring commercial safety. Furthermore, the feature includes Content Credentials in the export process, enhancing transparency and recognition for creators by providing metadata detailing how the content was created using AI.

Source: Adobe

Notable Creators Condemn AI Use of Unlicensed Creative Works

A coalition of prominent creators, including Björn Ulvaeus, Julianne Moore, and Kazuo Ishiguro, have signed a statement opposing the unlicensed use of their creative works for training generative AI models. The statement argues that such practices pose a significant threat to the livelihoods of artists and writers and calls for these actions to be prohibited. The petition, available for public signature, aims to garner support from individuals from diverse professions to address and regulate the issue of unauthorized AI training using creative works, highlighting the need for ethical standards in AI development and training data usage.

Source: aitrainingstatement.org

AI Fellowship Program Boosts Local Journalism

The Lenfest Institute for Journalism has partnered with OpenAI and Microsoft to launch the Lenfest Institute AI Collaborative and Fellowship program, designed to support local journalism through AI innovation. Five media organizations—Chicago Public Media, The Minnesota Star Tribune, Newsday, The Philadelphia Inquirer, and The Seattle Times—will receive grants and AI enterprise credits to hire AI fellows for two-year projects focused on enhancing business sustainability and integrating AI technologies. These projects will explore AI applications such as data analysis, audience engagement, transcription services, and advertising strategies. Through collaborative efforts and shared learnings, the initiative seeks to provide independent newsrooms with advanced AI tools, ultimately aiming to secure the future of local journalism.

Source: Lenfest Institute

8. Dev Alerts

The past two weeks have brought us some new dev-infused goodies, both in terms of new features and some insightful cookbooks. On the official dev front, OpenAI has introduced the audio modality to its Chat Completions API, for both inputs and outputs. OpenAI also made a new addition to its Playground in the form of prompt and schema generation, using what OpenAI refers to as “meta-prompting”. Several new cookbooks have also been released, demonstrating a custom GPT with Actions for the GitHub API, the Model Distillation API with some very impressive results, and a voice translation guide using the audio modality on the Chat Completions API. Finally, it was confirmed that the Realtime API has been inflating costs; the root cause was identified, and OpenAI is working on bringing those costs down, so stay tuned!

Inflated Charging on Realtime API

In the following community thread, a number of users pointed out that the Realtime API was charging significantly more than expected. Community member lucasvan later identified that a significant portion of the cost (around 85%) comes from input audio tokens accumulating over the course of a conversation, leading to inflated charges. A secondary source of inflation is accidentally generating large numbers of output tokens, for example by requesting long responses: even if the user interrupts or doesn’t listen to the output, the tokens are still generated and billed. This analysis was confirmed by Jeff Harris from OpenAI, who mentioned that OpenAI is working on potential solutions (including caching) to bring these costs down.
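The accumulation effect is easy to see with some back-of-the-envelope arithmetic: because each new turn re-sends the entire audio history as input, billed input tokens grow roughly quadratically with the number of turns. The per-turn token count below is an assumed round number for illustration, not a measured figure from the thread.

```python
def accumulated_input_tokens(turns, tokens_per_turn):
    """Model of Realtime API input billing: every turn, the whole audio
    history so far is billed again as input. (Illustrative model with a
    uniform assumed per-turn token count.)"""
    billed = 0
    history = 0
    for _ in range(turns):
        history += tokens_per_turn   # this turn's audio joins the history
        billed += history            # the entire history is billed as input
    return billed

# 10 turns of 500 audio tokens each: only 5,000 tokens of *new* audio,
# but far more is actually billed once history re-sending is accounted for.
naive = 10 * 500
actual = accumulated_input_tokens(10, 500)
```

With these assumed numbers the billed total is 27,500 tokens, a 5.5x multiple of the naive estimate, which is in the same ballpark as the ~2.5x+ overcharge users reported once output tokens and real conversation shapes are factored in.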

OpenAI's Chat Completions API Now Supports Audio

OpenAI has enhanced its Chat Completions API to support audio inputs and outputs, allowing users to receive responses in text, audio, or both. This update enables the generation of spoken audio responses and the use of audio inputs, which can convey richer data than text alone by capturing nuances such as tone and inflection. The API supports various use cases, including generating audio summaries from text, performing sentiment analysis on audio recordings, and facilitating asynchronous speech-to-speech interactions. Users can access these features via the REST API or OpenAI’s SDKs, with the gpt-4o-audio-preview model requiring either audio input or output. This development aims to enhance the versatility and interactivity of AI-driven conversations. Learn all the details in this guide.

AI-Powered Prompt and Schema Generation in Playground

OpenAI’s Playground introduces a Generate button designed to streamline the creation of prompts, functions, and schemas from task descriptions, leveraging AI to enhance efficiency. This feature employs meta-prompts and meta-schemas, which incorporate best practices to generate or refine prompts and produce valid JSON and function syntax. The meta-prompts guide the AI in understanding task objectives and improving existing prompts while maintaining clarity and conciseness. For schema generation, a pseudo-meta-schema is used to ensure adherence to strict mode constraints, despite limitations in supported features. This approach allows for the generation of structured outputs and function schemas, ensuring all fields are marked as required and additional properties are restricted, thereby enhancing the reliability and precision of AI-generated outputs. Explore the new capability further in this guide, or in a recent cookbook that walks through the functionality.
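The strict-mode constraints mentioned above (every field listed in `required`, extra properties forbidden) look like this in a generated schema. The field names and the `extract_event` name are illustrative, not from OpenAI's documentation; only the structural rules are the point.

```python
# A schema obeying the strict-mode constraints the generator enforces:
# every declared property appears in "required", and "additionalProperties"
# is false so the model cannot emit unexpected keys.
schema = {
    "name": "extract_event",   # illustrative schema name
    "strict": True,
    "schema": {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "date": {"type": "string"},
            "attendees": {"type": "array", "items": {"type": "string"}},
        },
        "required": ["title", "date", "attendees"],
        "additionalProperties": False,
    },
}

# Sanity checks mirroring the strict-mode rules:
props = schema["schema"]["properties"]
assert sorted(schema["schema"]["required"]) == sorted(props)
assert schema["schema"]["additionalProperties"] is False
```

These two rules are what make structured outputs reliable: the model can neither omit a field nor invent one, so downstream parsing never hits a missing or surprise key.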

Cookbook additions: Model Distillation, Voice Translation, GitHub GPT Actions

OpenAI has further expanded its cookbook collection with three new additions. The cookbook “Leveraging model distillation to fine-tune a model” demonstrates how to use model distillation to transfer the performance of a larger model (gpt-4o) to a smaller model (gpt-4o-mini) for a wine grape variety classification task. The cookbook “Voice Translation into Different Languages” provides a comprehensive guide on using GPT-4o’s audio-in and audio-out modality to efficiently translate and dub audio content from English to Hindi, detailing steps for transcription, dubbing, and evaluating translation quality with BLEU and ROUGE scores. Finally, the cookbook “GPT Actions library - GitHub” provides instructions for developers connecting a GPT Action to GitHub.

Careers at OpenAI

OpenAI is hiring for 144 positions across various teams and locations. The majority of roles are concentrated in San Francisco, with additional opportunities in Tokyo, London, Dublin, New York City, and remote positions in Paris and Washington, D.C.

Compared to two weeks ago:

  • The focus seems to have slightly shifted towards post-training, human data, and safety roles, indicating evolving priorities in AI safety and system optimization.

  • There’s a slight reduction in remote job opportunities, though San Francisco continues to be the main hub.

  • Some unique operational roles have already been filled, while safety-focused and research-heavy positions have gained prominence.

Also: if you are interested in preventing Skynet, or just generally care about LLMs not being misused, check out this role!

For the full list of open job opportunities, see here.

9. Community Spotlight

We are happy to introduce the community spotlight for the very first time! The idea is to highlight interesting and constructive discussions, new community-led tools and SDKs, events, games, and other shenanigans that bring our dev community together. In this edition, we highlight a super interesting topic that exposed the overcharging on the Realtime API, and we summarize the collection of questions for the AMA with Sam Altman for tomorrow’s DevDay in London.

Overcharging over Realtime API

This topic demonstrated a true community spirit. gowner initiated the topic by bringing to our attention that, in their experience, the Realtime API was costing more than 2.5x what the pricing page suggests. The community then got together and dug into these costs, ultimately culminating in lucasvan providing a detailed analysis of where and why the overcharging was happening. In short, most of the overcharge was due to input audio tokens being carried over across turns. Jeff Harris from OpenAI acknowledged the issue and stated that OpenAI is looking into how to bring these costs down. This is a true example of how our dev community can have a very positive impact on the quality of OpenAI services. Well done to all community members who contributed to this thread!

DevDay London 2024: Your Questions for an AMA with Sam Altman

In preparation for London DevDay, vb started a topic collecting questions from the community directed at Sam Altman’s AMA. We can summarize the collected questions/topics as follows:

  • AI-generated low-quality content: Concerns about the influx of low-quality AI content online (e.g., AI images, articles, videos) and whether OpenAI plans to address this issue, or if it’s outside their control.
  • Family Plan for ChatGPT: A request for a family subscription plan for ChatGPT Plus, with individual logins and shared prompt quotas, as current costs aren’t viable for families.
  • Pricing and Plan Options: Suggestions to introduce more affordable subscription options, such as paying for a limited number of prompts ($5/month for 500 prompts) instead of the current $20 flat fee.
  • Context length and memory: Questions about progress beyond the 128k context length and whether memory functionality (smart memory retrieval) will be integrated into APIs.
  • AI version of Sam Altman: Curiosity about whether Sam Altman is considering creating an AI bot version of himself with his voice and mannerisms.
  • Universal Basic Income (UBI): Asking if UBI is viable due to AI-induced job displacement and how it could be implemented globally, especially considering U.S.-based AI companies profiting from international customers.
  • New industries and economic models: What new industries or models Sam Altman foresees emerging as AI reshapes traditional sectors.
  • Custom GPTs: Concerns about the lack of updates for Custom GPTs, and whether they are still a priority for OpenAI.
  • Duties to original creators: Questioning whether OpenAI has a responsibility to the original creators (writers, musicians, etc.) whose work was used to train its models, often without consent.
  • Path to AGI: Requesting Altman’s thoughts on whether scaling up language models and improving algorithms will lead to AGI, or if entirely new technologies will be required to surpass LLM limitations.

This kind of stuff gets me so excited to orchestrate conversations instead of doing the typical, boring, single-thread locking I/O nonsense.

I am hoping for a near-future where we can host our own “audio models” that work in both RealTime API and the TTS services. Because I want to get mad with some borderline over-engineered solutions :crazy_face:

Since we’re talking about audio-related wishlists. I also would love if OpenAI introduced something like IPA so that we can add pronunciation to our TTS models.

Thanks for the information!


One can only dream to get Advanced Voice Mode capability level Models.

Once we get access to a model on that level, developing solutions will be SO much fun.

Currently the realtime model often struggles in languages other than English.

Accents somewhat work, but not really.

Can’t wait for improvements!

