AI Week 33 Roundup: SearchGPT Tests, Claude 3.5 Haiku, Llama 3.1, and More

AI Week 33 Roundup: SearchGPT Tests, Claude 3.5 Haiku, Llama 3.1, and More
From search innovations to safety enhancements, Week 33 brought significant updates across the AI landscape. Hereβs a friendly overview of whatβs important and why it matters.
OpenAI Quietly Tests SearchGPT
OpenAI has begun limited testing of SearchGPT, a web search tool that combines conversational responses with live results and citations. Initial reports highlight a focus on speed, transparency of sources, and a more user-friendly interface compared to traditional chatbot browsing modes (The Verge). A public placeholder is now available at search.openai.com.
Why it matters:
- This indicates a competitive move into the AI-powered search space, where Google, Microsoft, and startups like Perplexity are exploring answer-first interfaces.
- If implemented effectively, combining search and chat can minimize the need for multiple tabs and enhance trust through citations and previews of sources.
What to watch next: accuracy, freshness, and responsible citation practices. Previous issues with data scraping and sourcing highlight the high standards needed for search-integrated AI (The Verge).
Anthropic Launches Claude 3.5 Haiku
Anthropic has introduced Claude 3.5 Haiku, a fast and cost-effective model from the 3.5 series. This model aims to provide robust reasoning at lower latency while enhancing tool usage and offering the Artifacts feature for collaborative structured outputs (Anthropic).
Highlights:
- A model focused on speed for production environments where throughput and cost are critical.
- Improved coding and analytical capabilities compared to earlier Haiku versions, together with enterprise controls borrowed from Claude 3.5 Sonnet.
- This is part of a larger trend that includes new projects and enhanced tools for better automation.
Takeaway: The 3.5 series is honing in on practical trade-offs. In many situations, the swiftest viable model tends to prevail, provided the quality remains consistent.
Meta Doubles Down on Open Models with Llama 3.1
Meta has announced the release of Llama 3.1, featuring new 8B and 70B models available under open licenses, along with a more powerful 405B model accessible through services and partnerships. The new models are designed to improve multilingual abilities, reasoning, and tool usage, and they incorporate safety features like Llama Guard 3 (Meta AI).
Why it matters:
- Open weights for the 8B and 70B models keep the open-source community active, enabling fine-tuning and on-premises deployment.
- The 405B model, even if not released as open weights, sets a new standard for open models via APIs.
Apple Intelligence Moves Toward Launch
Apple has previewed Apple Intelligence for iPhone, iPad, and Mac, focusing on on-device privacy, writing and image tools, and a system-level assistant that can use models both on-device and in Private Cloud Compute when needed (Apple). Expect a gradual rollout that aligns with modern Apple silicon.
Why it matters: Apple is embedding AI features into the operating system, adopting a privacy-first approach that may set a standard for consumer AI user experiences.
Open-Source and Model Updates
- Mistral released Mistral Large 2, a robust generalist model aimed at reasoning and multilingual tasks (Mistral).
- Hugging Face shared insights regarding a May security incident involving compromised tokens, highlighting the importance of strict key rotation and least-privilege settings within ML pipelines (Hugging Face).
Policy and Safety: AI Act and Secure-by-Design
The EU AI Act became effective in early August, initiating phased obligations for providers. High-risk systems and general-purpose models now face tighter transparency and safety regulations on a defined timeline (European Commission).
Additionally, security baselines for LLM applications are evolving. The OWASP Top 10 for LLM Applications outlines common risks like prompt injection and insecure tool use, which are valuable for teams developing AI features quickly (OWASP). NIST has also published a draft of a Generative AI Risk Management Profile to assist organizations in identifying domain-specific risks and controls (NIST).
Research to Watch
- Self-Discover proposes allowing models to create their own problem-decomposition strategies prior to reasoning, which could enhance accuracy on complex tasks in clean benchmarks (Google Research).
- Mixture-of-Agents and related ensemble methods are showing potential for increasing reliability by coordinating numerous specialized models or roles, though they may raise costs and latency in actual workloads (arXiv).
Quick Hits
- Look out for more experiments that blend search and chat as evaluation methods for grounded answers, citations, and freshness continue to improve.
- Open model families remain attractive options for businesses requiring customization, control, and reduced unit costs.
- Security posture is crucial: rotate keys, segment infrastructure, and monitor activity involving models and tools.
Bottom Line
Week 33 showcased steady advancements: faster mid-tier models for production, serious strides towards AI-native search, and a stronger foundation for regulations and safety. For teams working with AI, the strategy is becoming clearer: choose the smallest model that meets quality requirements, support outputs with reliable data and sources, and integrate security and evaluation from the start.
FAQs
What is SearchGPT and how is it different from ChatGPT with browsing?
SearchGPT seems to be a dedicated search interface focused on delivering quick results, citations, and source previews, unlike a general chatbot that occasionally retrieves web pages. Details may adjust as testing progresses.
Is Claude 3.5 Haiku suitable for production?
Yes, for many tasks. It is designed for strong reasoning with lower latency and costs. Always conduct A/B tests against your current model on real applications to evaluate accuracy and throughput.
Can I deploy Llama 3.1 on-premises?
Yes, the 8B and 70B models come with open weights. The 405B version is available through services and partners but is not downloadable.
How should teams address LLM security risks?
Begin with secure-by-design principles: anticipate prompt injection and tool misuse, implement allowlists for tools and data sources, log model interactions, and regularly rotate credentials. The OWASP LLM Top 10 provides a useful checklist.
What changes with the EU AI Act now?
The law is now active with phased compliance deadlines. Providers of high-risk systems and general-purpose models should identify their obligations and keep track of implementing acts and guidance from EU authorities.
Sources
- OpenAI is Testing SearchGPT β The Verge
- SearchGPT Landing Page β OpenAI
- Claude 3.5 Haiku β Anthropic
- Introducing Llama 3.1 β Meta AI
- Introducing Apple Intelligence β Apple
- Mistral Large 2 β Mistral AI
- Security Incident Update β Hugging Face
- EU AI Act Overview β European Commission
- OWASP Top 10 for LLM Applications β OWASP
- Draft Generative AI Risk Management Profile β NIST
- Perplexity Scraping Controversy β The Verge
- Self-Discover β arXiv
- Mixture-of-Agents β arXiv
Thank You for Reading this Blog and See You Soon! π π
Let's connect π
Latest Blogs
Read My Latest Blogs about AI

Gemini is Coming Home: What to Expect from Googleβs Next Nest Cam and AI-Powered Smart Home
Google teases Gemini for the smart home and a new Nest Cam, with a full reveal next month. Hereβs what to expect, why it matters, and how it may work.
Read more