Author: Advanced AI Bot
The Edmond & Lily Safra Center for Ethics welcomes Julia Nefsky for a public lecture titled “Rescuing Ourselves from the Pond Analogy.” The lecture is open to all. Registration is required for in-person attendance. This event is also available via livestream. About the lecture: “We live in a world in which many people are suffering. How should this affect the way we live our lives?” This talk presents a paper that Professor Julia Nefsky is writing jointly with Sergio Tenenbaum, as part of a broader project that they are working on together. We live in a world in which…
Training Diffusion Models with Reinforcement Learning We deployed 100 reinforcement learning (RL)-controlled cars into rush-hour highway traffic to smooth congestion and reduce fuel consumption for everyone. Our goal is to tackle “stop-and-go” waves, those frustrating slowdowns and speedups that usually have no clear cause but lead to congestion and significant energy waste. To train efficient flow-smoothing controllers, we built fast, data-driven simulations that RL agents interact with, learning to maximize energy efficiency while maintaining throughput and operating safely around human drivers. Overall, a small proportion of well-controlled autonomous vehicles (AVs) is enough to significantly improve traffic flow and fuel efficiency…
Digital technologies have transformed education over the last two decades. I’m only in my 50s, but when I went to school the most technologically advanced thing in class was a pocket calculator. Now, iPads and other tablets are commonplace. Museums and galleries the world over have integrated touch screens and interactive elements into their exhibits. Apps like Duolingo have brought language learning to smartphones. The fact that these things have become normalized so quickly is a testament to how rapidly and seamlessly we’ve all integrated new technologies into our lives. But there are limits to 2D technologies. While remote learning tools…
In this issue: We examine a new conversation segmentation method that delivers more coherent and personalized agent conversation, and we review efforts to improve MLLMs’ understanding of geologic maps. Check out the latest research and other updates. NEW RESEARCH SeCom: On Memory Construction and Retrieval for Personalized Conversational Agents Researchers from Microsoft and Tsinghua University propose a new method to help conversational AI agents deliver more coherent and personalized responses during complex long-term dialogue. Large language models (LLMs) are widely used to enable more complicated discussions across a broader range of topics than traditional dialogue systems. However, managing excessively long…
For example, I’ve been playing around with an experimental system I built for myself using GPT-3 designed to help me write a science fiction book, which is something that I’ve wanted to do since I was a teenager. I have notebooks full of synopses I’ve created for theoretical books, describing what the books are about and the universes where they take place. With this experimental tool, I have been able to get the logjam broken. When I wrote a book the old-fashioned way, if I got 2,000 words out of a day, I’d feel really good about myself. With this tool,…
On August 1, 2024, Black Forest Labs released three new Flux.1 models: Pro, Dev and Schnell. The Pro version is not open source and is available through their API, but Dev and Schnell are both open source and available to download from their Hugging Face page. Dev is a higher-quality model than Schnell, but Schnell is much faster (4 steps). These are big models, though: each weighs a whopping 23.8GB, and they require a high level of VRAM to run. It is recommended that you have 32GB of RAM. However, don’t be sad because there is a way to run…
Organizations are eager to move into the era of agentic AI, but moving AI projects from development to production remains a challenge. Deploying agentic AI apps often requires complex configurations and integrations, delaying time to value. Barriers to deploying agentic AI: Knowing where to start: Without a structured framework, connecting tools and configuring systems is time-consuming. Scaling effectively: Performance, reliability, and cost management become resource drains without a scalable infrastructure. Ensuring security and compliance: Many solutions rely on uncontrolled data and models instead of permissioned, tested ones. Governance and observability: AI infrastructure and deployments need clear documentation and traceability. Monitoring…
Temporal consistency is critical in video prediction to ensure that outputs are coherent and free of artifacts. Traditional methods, such as temporal attention and 3D convolution, may struggle with significant object motion and may not capture long-range temporal dependencies in dynamic scenes. To address this gap, we propose the Tracktention Layer, a novel architectural component that explicitly integrates motion information using point tracks, i.e., sequences of corresponding points across frames. By incorporating these motion cues, the Tracktention Layer enhances temporal alignment and effectively handles complex object motions, maintaining consistent feature representations over time. Our approach is computationally efficient and can…
Text-guided image editing aims to modify specific regions of an image according to natural language instructions while maintaining the general structure and the background fidelity. Existing methods utilize masks derived from cross-attention maps generated from diffusion models to identify the target regions for modification. However, since cross-attention mechanisms focus on semantic relevance, they struggle to maintain the image integrity. As a result, these methods often lack spatial consistency, leading to editing artifacts and distortions. In this work, we address these limitations and introduce LOCATEdit, which enhances cross-attention maps through a graph-based approach utilizing self-attention-derived patch relationships to maintain smooth, coherent…
When implementing machine learning (ML) workflows in Amazon SageMaker Canvas, organizations might need to consider external dependencies required for their specific use cases. Although SageMaker Canvas provides powerful no-code and low-code capabilities for rapid experimentation, some projects might require specialized dependencies and libraries that aren’t included by default in SageMaker Canvas. This post provides an example of how to incorporate code that relies on external dependencies into your SageMaker Canvas workflows. Amazon SageMaker Canvas is a low-code no-code (LCNC) ML platform that guides users through every stage of the ML journey, from initial data preparation to final model deployment. Without…