Over the past five years, advances in AI models’ data processing and reasoning capabilities have driven enterprise and industrial developers to pursue larger models and more ambitious benchmarks. Now, with agentic AI emerging as the successor to generative AI, demand for smarter, more nuanced agents is growing. Yet too often, smart AI is measured by model size or the volume of its training data.

Data analytics and artificial intelligence company Databricks argues that today’s AI arms race misses a crucial point: In production, what matters most is not what a model knows, but how it performs when stakeholders rely on it. Jonathan Frankle, chief AI scientist at Databricks, emphasizes that real-world trust and return on investment come from how AI models behave in production, not from how much information they contain. Unlike traditional software, AI models generate probabilistic outputs rather than deterministic ones.

“The only thing you can measure about an AI system is how it behaves. You can’t look inside it. There’s no equivalent to source code,” Frankle tells Fast Company.

He contends that while public benchmarks are useful for gauging general capability, enterprises often over-index on them. What matters far more, he says, is rigorous evaluation on business-specific data to measure quality, refine outputs, and guide reinforcement learning strategies.

“Today, people often deploy agents by writing a prompt, trying a couple of inputs, checking their vibes, and deploying. We would never do that in software, and we shouldn’t do it in AI, either,” he says.

Frankle explains that for AI agents, evaluations replace many traditional engineering artifacts: the discussion, the design document, the unit tests, and the integration tests. There’s no equivalent to a code review because there’s no code behind an agent, and prompts aren’t code. That, he argues, is precisely why evaluations matter and should be the foundation of responsible AI deployment.
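To make the contrast with "checking their vibes" concrete, here is a minimal sketch of an evaluation gate on business-specific data. It is an illustration only, not Databricks' tooling: the invoice-extraction task, the `toy_agent` function, the exact-match scoring, and the 0.9 threshold are all assumptions made up for this example.

```python
from typing import Callable, List, Tuple

EvalCase = Tuple[str, str]  # (input text, expected output)


def toy_agent(text: str) -> str:
    """Hypothetical agent: returns the first token that starts with 'INV-'."""
    for word in text.split():
        token = word.strip(",.")
        if token.startswith("INV-"):
            return token
    return ""


def pass_rate(agent: Callable[[str], str], cases: List[EvalCase]) -> float:
    """Fraction of cases where the agent output exactly matches the expected one."""
    if not cases:
        raise ValueError("evaluation set is empty")
    passed = sum(1 for text, expected in cases if agent(text) == expected)
    return passed / len(cases)


def ready_to_deploy(rate: float, threshold: float = 0.9) -> bool:
    """Deployment gate: the threshold is an assumed, business-chosen value."""
    return rate >= threshold


# A tiny, made-up evaluation set; a real one would come from business data.
CASES: List[EvalCase] = [
    ("Invoice INV-1001 is due Friday", "INV-1001"),
    ("Ref INV-2040, now overdue", "INV-2040"),
    ("please pay inv-77 today", "INV-77"),  # lowercase ID the toy agent misses
]

rate = pass_rate(toy_agent, CASES)
print(f"pass rate: {rate:.2f}, deploy: {ready_to_deploy(rate)}")
```

Run as written, this prints `pass rate: 0.67, deploy: False`: the agent handles the two clean inputs but misses the lowercase ID, which is the kind of gap a couple of hand-picked inputs might never reveal.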
The shift from focusing on belief to emphasizing behavior is the foundation of two major innovations by Databricks this year: Test-Time Adaptive Optimization (TAO) and Agent Bricks. Together, these technologies seek to make behavioral evaluation the first step in enterprise AI, rather than an afterthought.

AI behavior matters more than raw knowledge

Traditional AI evaluation often relies on benchmark scores and labeled datasets derived from academic exercises. While those metrics have value, they rarely reflect the contextual, domain-specific decisions businesses face. In production, agents may need to generate structured query language (SQL) in a company’s proprietary dialect, accurately interpret regulatory documents, or extract highly specific fields from messy, unstructured data.

Naveen Rao, vice president of AI at Databricks, says these are fundamentally behavioral challenges, requiring iterative feedback, domain-aware scoring, and continuous tuning, not simply more baseline knowledge.

“Generic knowledge might be useful to consumers, but not necessarily to enterprises. Enterprises need differentiation; they must leverage their assets to compete effectively,” he tells Fast Company. “Interaction and feedback are critical to understanding what is important to a user group and when to present it. What’s more, there are certain ways information needs to be formatted depending on the context. All of this requires bespoke tuning, either in the form of context engineering or actually modifying the weights of the neural network.”

In either case, he says, a robust reinforcement learning harness is essential, paired with a user interface to capture feedback effectively. That is the promise of TAO, the Databricks research team’s model fine-tuning method: improving performance using inputs enterprises already generate, and scaling quality through compute power rather than costly data labeling and annotation.
While most companies treat evaluation as an afterthought at the end of the pipeline, Databricks makes it central to the process. TAO uses test-time compute to generate multiple responses, scores them with automated or custom judges, and feeds those scores into reinforcement learning updates to fine-tune the base model. The result is a tuned model that delivers the same inference cost as the original, with heavy compute applied only once during tuning, not on every query.

“The hard part is getting AI models to do well at your specific task, using the knowledge and data you have, within your cost and speed envelope. That’s the shift from general intelligence to data intelligence,” Frankle says. “TAO can help tune inexpensive, open-source models to be surprisingly powerful using a type of data we’ve found to be common in the enterprise.”

According to a Databricks blog, TAO improved open-source Llama variants, with tuned models scoring significantly higher on enterprise benchmarks such as FinanceBench, DB Enterprise Arena, and BIRD-SQL. The company claims the method brought Llama models within range of proprietary systems like GPT-4o and o3-mini on tasks such as document Q&A and SQL generation, while keeping inference costs low. In a broader multitask run using 175,000 prompts, TAO boosted Llama 3.3 70B performance by about 2.4 points and Llama 3.1 70B by roughly 4.0 points, narrowing the gap with contemporary large models.

To complement its model fine-tuning technique, Databricks has introduced Agent Bricks, an agentic AI-powered feature within its Data Intelligence Platform. It enables enterprises to customize AI agents with their own data, adjust neural network weights, and build custom judges to enforce domain-specific rules. The product aims to automate much of agent development: Teams define an agent’s purpose and connect data sources, and Agent Bricks generates evaluation datasets, creates judges, and tests optimization methods.
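The TAO loop described above (generate several responses per prompt, score each with a judge, pass the scores on as a training signal) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not Databricks' implementation: `sample_candidates` is a hypothetical stand-in for sampling from a base model, `judge` is a made-up rule-based scorer, and the reinforcement learning update itself is omitted.

```python
from typing import Callable, List, Tuple

Triple = Tuple[str, str, float]  # (prompt, response, reward)


def sample_candidates(prompt: str, n: int) -> List[str]:
    """Hypothetical stand-in for sampling n responses from a base model."""
    pool = [
        "SELECT * FROM orders",
        "select everything from orders please",
        "SELECT id, total FROM orders WHERE total > 100",
    ]
    return [pool[i % len(pool)] for i in range(n)]


def judge(prompt: str, response: str) -> float:
    """Toy judge (an assumption): rewards responses that look like filtered SQL."""
    score = 0.0
    if response.startswith("SELECT"):
        score += 0.5
    if "WHERE" in response:
        score += 0.5
    return score


def score_candidates(
    prompts: List[str],
    n: int,
    judge_fn: Callable[[str, str], float] = judge,
) -> List[Triple]:
    """Spend extra compute at tuning time: sample n responses per prompt and score each."""
    triples: List[Triple] = []
    for prompt in prompts:
        for response in sample_candidates(prompt, n):
            triples.append((prompt, response, judge_fn(prompt, response)))
    return triples


def best_response(triples: List[Triple], prompt: str) -> str:
    """Return the highest-reward response recorded for one prompt."""
    candidates = [t for t in triples if t[0] == prompt]
    return max(candidates, key=lambda t: t[2])[1]


triples = score_candidates(["orders over $100"], n=3)
print(best_response(triples, "orders over $100"))
```

In a real pipeline, the scored triples would feed a reinforcement learning update of the model weights; that step is left out here. Note that no human-written labels appear anywhere in the sketch: only prompts and a judge are needed, which is the property the article attributes to TAO.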
Customers can choose to optimize for maximum quality or lower cost, enabling faster iteration with human oversight and fewer manual tweaks.

“Databricks’ latest research techniques, including TAO and Agent Learning from Human Feedback (ALHF), power Agent Bricks. Some use cases call for proprietary models, and when that’s the case, it connects them securely to your enterprise data and applies techniques like retrieval and structured output to maximize quality. But in many scenarios, a fine-tuned open model may outperform at a lower cost,” Rao says.

He adds that Agent Bricks is designed so domain experts, regardless of coding ability, can actively shape and improve AI agents. Subject matter experts can review agent responses with simple thumbs-up or thumbs-down feedback, while technical users can analyze results in depth and provide detailed guidance.

“This ensures that AI agents reflect enterprise goals, domain knowledge, and evolving expectations,” Rao says, noting that early customers saw rapid gains. AstraZeneca processed more than 400,000 clinical trial documents and extracted structured data in less than an hour with Agent Bricks. Likewise, the feature enabled Flo Health to double its medical-accuracy metric compared with commercial large language models while maintaining strict privacy and safety.

“Their approach blends Flo’s specialized health expertise and data with Agent Bricks, which leverages synthetic data and tailored evaluation to deliver reliable, cost-effective AI health support at scale, uniquely positioning us to advance women’s health,” Rao explains.

From benchmarks to business data

The shift toward behavior-first evaluation is pragmatic but not a cure-all. Skeptics warn that automated evaluations and tuning can just as easily reinforce bias, lock in flawed outputs, or allow performance to drift unnoticed.

“In some domains we truly have automatic verification that we can trust, like theorem proving in formal systems. In other domains, human judgment is still crucial,” says Phillip Isola, associate professor and principal investigator at MIT’s Computer Science & Artificial Intelligence Laboratory. “If we use an AI as the critic for self-improvement, and if the AI is wrong, the system could go off the rails.”

Isola points out that while self-improving AI systems are generating excitement, they also carry heightened safety and security risks.

“They are less constrained, lacking direct supervision, and can develop strategies that might be unexpected and have negative side effects,” he says, also warning that companies may game benchmarks by overfitting to them. “The key is to keep updating evaluations every year so we’re always testing models on new problems they haven’t already memorized.”

Databricks acknowledges the risks. Frankle stresses the difference between bypassing human labeling and bypassing human oversight, noting that TAO is simply a fine-tuning technique fed by data enterprises already have. In sensitive applications, he says, safeguards remain essential and no agent should be deployed without rigorous performance evaluation.

Other experts note that greater efficiency doesn’t automatically improve AI model alignment, and there’s currently no clear way to measure that alignment.

“For a well-defined task where an agent takes action, you could add human feedback, but for a more creative or open-ended task, is it clear how to improve alignment? Mechanistic interpretability isn’t strong enough yet,” says Matt Zeiler, CEO of Clarifai.

Zeiler argues that the industry’s reliance on a mix of general and specific benchmarks needs to evolve. While these tests condense many complex factors into a few simple numbers, models with similar scores don’t always feel equally good in use.
“That feeling isn’t captured in today’s benchmarks, but either we’ll figure out how to measure it, or we’ll just accept it as a subjective aspect of human preference; some people will simply like some models more than others,” he says.

If the results from Databricks hold, enterprises may rethink their AI strategy, prioritizing feedback loops, evaluation pipelines, and governance over sheer model size or massive labeled datasets, and treating AI as a system that evolves with use rather than a onetime product.

“We believe the future of AI lies not in bigger models, but in adaptive, agentic systems that learn and reason over enterprise data,” Rao says. “This is where infrastructure and intelligence blur: You need orchestration, data connectivity, evaluation, and optimization working together.”
Category:
E-Commerce
In 2022, Diarrha N’Diaye-Mbaye had achieved a lifelong dream: Ami Colé, her three-year-old beauty brand, was on the shelves of Sephora. In the wake of George Floyd’s murder in 2020, she’d received a wave of support from venture capitalists and retailers. But by this year, much of that interest had dried up. In mid-July, N’Diaye-Mbaye abruptly announced she would be shuttering her fledgling brand because she could not find enough capital to stay afloat.

The news sent shock waves through the beauty industry, but it’s an increasingly familiar story for venture-backed Black-owned brands, particularly those that scaled with the help of major retailers who went all-in on DEI after 2020’s racial reckoning. Tina Wells recently shut down Wndr Ln, the luggage brand she launched in partnership with Target, after the retailer cancelled all future orders. Thirteen Lune, a diversity-focused online retailer, went through insolvency proceedings last December. Many other Black-owned beauty brands have closed in the wake of Trump’s election, including Beauty Bakerie, Ceylon, and Koils by Nature.

Black founders are now trying to figure out what went wrong. For many, the answer is that investors and retailers like Target quickly launched diversity, equity, and inclusion (DEI) programs without long-term strategies to help Black-owned brands scale and find success. Ultimately, DEI was often perceived as a moral endeavor, rather than smart business. So it’s not that surprising that so many of these brands are now struggling.

“DEI was synonymous with altruism, rather than strategy,” says Marcus Collins, a professor of marketing at the University of Michigan and the author of For The Culture. “They saw serving Black people as a good thing to do rather than seeing them as consumers with unbelievable buying power.”

The Rise and Fall of Ami Colé

At Sephora’s annual beauty festival last September, crowds gathered in a pavilion featuring the hottest up-and-coming brands to grace the retailer’s shelves.
Ami Colé’s booth was designed to look like a Harlem hair salon, complete with a bright orange swivel chair and African-inspired baskets. It was a proud moment for founder Diarrha N’Diaye-Mbaye, who named her company after her mother, a Senegalese immigrant who opened a hair salon in New York.

As she described in The Cut, she began building Ami Colé in 2019, but it wasn’t until 2021 that retailers and investors began returning her calls. As the Black Lives Matter uprisings spread across the country, the business community tried to address systemic racism by launching diversity, equity, and inclusion (DEI) programs. Target vowed to invest $2 billion in at least 500 Black-owned businesses; Walmart poured $100 million into a racial equity center; Sephora took a pledge to devote 15% of its shelf space to Black-owned brands. For months, Black founders received an influx of cash and interest from retailers: N’Diaye-Mbaye herself raised $1 million to launch her brand.

But five years later, as the Trump administration wages war against DEI, the mood in the country has shifted, and support for Black entrepreneurs is drying up. Now N’Diaye-Mbaye does not have enough capital to keep her brand going, and she’s far from alone. According to Crunchbase, Black-owned beauty brands raised $16 million in 2024, a sharp decline from $73 million in 2022.

This withdrawal of support for Black-owned brands is happening across product categories, and a wave of startups has quietly closed in recent months. Some of these brands’ founders are now in a worse position than they were before they received the DEI support; they’re dealing with debt, unsold inventory, and other liabilities.

DEI Programs Had No Long-Term Vision

Karen Young, founder of the beauty brand Oui the People, predicted this wave of closures. In July 2024, she posted a TikTok video about Black founders she knew who were struggling to get the funding they needed to keep their businesses afloat, as support evaporated.
While many brand founders dream of getting picked up by a national retailer, Young knows firsthand how expensive this can be. To launch Oui the People at Sephora, she had to buy enormous quantities of inventory, pay for displays and product samples, and pour a lot of money into marketing to get on consumers’ radars.

This is consistent with other reporting I’ve done about how brands can spend upwards of $100,000 on in-store fixtures at Sephora, and must also provide product testers and samples. A spot on a seasonal display at the store can cost $250,000. And after all of that, Sephora takes a 65% cut of sales. To pay for all of this, Young raised $8 million in venture capital, led by New Age Capital.

“The first thing Sephora’s merchants asked me was whether I had funding,” says Young. “You need capital to get off the ground. It’s only when you scale that you have a path to profitability.”

Young did all of this work in 2019, before the DEI programs began popping up. This turned out to be a blessing, she says, because Sephora and her investors worked with her to come up with a plan for Oui the People to find its place in the market and achieve scale.

In contrast, after Floyd’s murder, many companies pumped money into Black-owned brands without any sort of long-term strategy. “DEI can’t come without infrastructure,” she says. “Retailers brought in these very small Black-owned businesses across their stores, then just stopped there.”

This is what happened to Ami Colé. (Sephora declined to comment; Ami Colé did not respond to our request for comment.) As N’Diaye-Mbaye writes in The Cut, she used her $1 million to launch at Sephora, but struggled to compete with brands that had access to far more capital. She eventually raised $2 million more from venture capital firms like G9 Ventures and Greycroft, but without further ongoing investment she sees no path to success.
“Diarrha performed miracles on the capital she raised, more than comparable brands in her category, like Kosas and Saie,” says Young. “She created a massive shade range and cultivated a loyal customer base. But she hasn’t had the same access to capital as comparable brands.”

Collins, the professor, says the sudden withdrawal of support for Black-owned brands is devastating for founders, and not just financially. “These entrepreneurs had hope,” he says. “They named their companies after their parents because they wanted to build a family legacy. And overnight, the rug was pulled out from under them.”

Doomed to Fail

Wells of Wndr Ln believes retailers treat Black-owned brands more poorly when they are brought in through DEI programs. She’s seen this firsthand across her two-decade career, in which she has owned a consulting business, launched her own brands, and written books for both children and adults.

“I’ve worked with 400 clients over the last two decades of my career,” she says. “Only two have come through DEI initiatives, and both were awful.”

One of those experiences happened with Target, a retailer she has worked with closely for six years. In 2019, Target asked Wells to write a series of children’s books featuring a Black female lead character, something they felt was missing on their shelves. (Wells had previously published successful middle grade books.) This led to a bestselling series called the Zee Files, which was exclusively sold at Target.

Later, Wells wrote a business book called The Elevation Approach, and Target invited her to create a line of coordinating home office products. In each case, Target poured substantial marketing dollars into the launches, which led to their success. “Target did it because it was good for business,” says Wells.
“They felt the Black customer was worth cultivating and investing in.”

Then, in the aftermath of Floyd’s murder, Target launched an internal committee called REACH, focused on improving racial equity throughout the company, including bringing on 500 new Black-owned brands. In 2021, REACH reached out to Wells, asking whether she would be interested in launching a luggage brand at Target. Given all of her positive experiences at Target thus far, Wells said yes and began developing Wndr Ln (pronounced Wonder Lane), a line of colorful suitcases and overnighters. She used her own money to manufacture the products.

When Wndr Ln debuted in August 2023, Wells noticed a lack of support from Target compared to her previous projects. Target did not invest much in marketing, nor were products prominently displayed in store. (A Target spokesperson confirmed it carried the Wndr Ln collection, but says the company’s policy is not to comment on vendor relationships.)

Shortly after the launch, Target cancelled all future orders. Wells never received a clear explanation about why this happened, but at the time, Target’s sales were in decline partly because of a massive consumer boycott over its Pride collection. Whatever the reason, Wells was left in a bind.

“If you’re producing product for a single retailer and they cancel future orders, your business is dead,” says Wells. “I was left with massive liability I am still dealing with today.” (She cannot comment on the financial details of the end of this Target partnership for legal reasons.)

For Wells, the problem with DEI programs is that they don’t often focus on driving profit and revenue. DEI is often seen as a moral issue, rather than an opportunity to bring in Black entrepreneurs who can target the valuable Black consumer. As a result, with a recession looming and the Trump administration attacking DEI, it is easy for retailers to abandon Black-owned brands. “America is a capitalistic society,” says Wells.
“The goal of Fortune 500 companies is to increase shareholder value. The minute anything is not in alignment with that goal is not going to have long-term success.”

Marcus Collins emphasizes that DEI programs were designed to address real problems. Historically, American companies have ignored the needs of Black and brown customers, even though abundant research shows that catering to diverse consumers is good for business. “The thing that gets in the way is racism, so let’s just call it what it is,” says Collins. “Companies don’t cater to Black people because they don’t think Black people matter.”

Going forward, Karen Young says that companies should focus on partnering with Black founders because they have better insight into the diverse consumers they’re seeking. And importantly, the business community needs to give these entrepreneurs access to the resources they require to succeed, including access to capital from banks and VCs. “These are baseline resources that other founders get,” says Young. “We’re just asking for the same treatment.”

But in some ways, setting up DEI programs may not even be necessary in the years to come, Wells argues. It will soon be abundantly clear that brands will lose out financially if they don’t cater to Black and brown consumers, as white people become the minority in the U.S. by the 2040s.

“America is becoming more diverse every day,” says Wells. “If you don’t want to serve your Black and brown customers, don’t worry. Someone else will come along to do it. That’s how capitalism works.”
In the filtered water space, there is one company that has dominated brand awareness for decades. Water pitchers and filtration devices from Brita can be found in so many millions of homes and offices around the world that the term market saturation is more than just a pun.

But there’s another water filtration company that, despite lower kitchen visibility, is actually a bigger player in the clean-water game. Culligan, founded in 1936 as a water softening and filtration service company, became known for its white-glove service.

Often installed in basements or storage closets, Culligan’s equipment was as utilitarian as a water heater or furnace. Once the system was installed in a home or office, its users hardly gave it another thought, or look.

“It was the technician that was actually working with the product,” says Kathy Chi Thurber, Culligan’s new global president of consumer products. “The products didn’t have to be beautiful, but the technicians had to be able to talk about our history, our capabilities, our research, and innovation.”

Now, as a private 15,000-person company that pulled more than $3 billion in revenue in 2023, Culligan is embarking on a total brand and strategy overhaul. And aggressively so. Within the past five years, Culligan has acquired 362 companies in the clean-water industry, from local water purifiers to filter companies to component manufacturers. It’s positioning itself as a dominant player in a world where water safety and water scarcity are of increasing concern.

Out of the basement and into your kitchen

One priority is to start competing more directly in the consumer space, bringing its equipment out of the basement and into the hands of water drinkers everywhere. “We’d never really given an eye to the consumer, and that has 100% changed,” says Chris Quatrochi, chief product and technology officer at Culligan International.
To venture into the Brita-dominated consumer market, Culligan turned to the industrial design firm Ammunition Group. Known best for its work designing Beats by Dre headphones and products for companies like Polaroid, Square, and Lyft, Ammunition was tasked with helping Culligan develop products that appeal to regular consumers. It also updated the brand to tell those consumers that Culligan is not the box-in-the-basement brand they may have known in the past.

“Our portfolio has not been the greatest from a, I would say, beauty perspective,” Quatrochi says. “If you really want to show that you are leading edge from a water-quality perspective, you have to have a product that demonstrates that.”

Ammunition started by applying its deep product design background to creating a water filtration pitcher that embodies this new company focus. Building on its 2020 acquisition of the water filter maker ZeroWater, Culligan’s ZeroWater Technology line of three handheld pitchers and two countertop dispensers is the company’s first foray into the consumer space.

Designing a better water pitcher

Ammunition’s design focused primarily on the ways people actually use filtered water pitchers. “One of the constraints is putting it in your refrigerator,” says industrial designer Robert Brunner, Ammunition’s founder.

Research into the market showed that more than 70% of water pitcher users, particularly those in the U.S. and Western Europe, keep their pitchers in the refrigerator, often in the door of the appliance. At the same time, most of the pitchers on the market don’t actually fit into a fridge door all that well. Their rectangular shape and bulging handle tend to take up a lot of space, and need more room around them to be moved in and out.

Ammunition rethought that form factor to better fit inside the refrigerator door, using a rounded square shape for the pitcher that allows it to fit more like a carton of milk.
The pitcher also has an innovative open-ended handle that cuts down on its overall bulk and allows more stuff to fit in the refrigerator door’s shelves alongside it, while also being more ergonomically comfortable to carry and hold.

“Figuring out how to have that single connection point for that handle so it’d be mechanically robust and reliable, it was actually a fair amount of engineering effort to make sure that could work when it’s getting filled up with water,” Brunner says. “The handle is extremely important, because when this thing is full, it’s quite heavy, and you have to be able to manipulate it, carry it, pour it. We wanted to maintain this simplicity.”

The design team also thought about the spout shape and the challenge of pouring water for people with dexterity and mobility issues. That led to considerations about one of the key parts of using a water pitcher: refilling it. Ammunition designed a sliding lid that makes holding the pitcher under the tap and refilling it easier.

The lid’s circular shape became a recurring theme in the design of the pitcher line, as well as the broader work Ammunition is doing across Culligan’s other product and service categories. “The circular element is really the most natural shape to route water from one place to another, pipes being the most obvious example,” says Christopher Kuh, vice president of Ammunition’s industrial design studio. “It’s really an important and core element.”

Another differentiating factor is the built-in water-quality meter. Measuring total dissolved solids (TDS) at the scale of parts per million, the digital meter slots into the pitchers and the countertop dispensers to give users a clear readout of how well the filter is functioning and when it’s time to replace it. “The TDS meter actually is going to start to read a value above zero at some point in time, which gives you a clear indication of the end of filter life,” Kuh says.
In a clever turn, the meter can be removed from the pitcher or dispenser to dip into, say, a glass of water direct from the tap to see just how much the filtration system is doing.

A bigger rebrand moment

These design moves were informed by deep user research Culligan has conducted over the past three years. Thurber says Ammunition was game for putting its design prototypes in the hands of users from the very early stages and taking their feedback to inform new iterations of the designs before landing on a final product that looks and feels different from what’s already out there.

“We all know who the major competitor is that has, like, 60% to 70% market share,” Thurber says. “It would be very hard to break through if we were not serious about what we wanted to do, and if we were not game-changing in our design and our functionality.”

But this doesn’t mean Culligan is abandoning the more utilitarian water products that have kept it in business for nearly a century. Instead, Ammunition’s design approach for the pitcher is being extended throughout Culligan’s product offerings, including the industrial-scale water softeners and filtration systems that still live in basements and utility closets, as well as the company’s large and growing business in office water coolers. Some of those redesigned products will be coming online in the next year.