Bing around and find out
Microsoft’s new and improved Bing, powered by a custom version of OpenAI’s ChatGPT, has experienced a dizzyingly quick reversal: from “next big thing” to “brand-sinking albatross” in under a week. And, well, it’s all Microsoft’s fault.
ChatGPT is a really interesting demonstration of a new and unfamiliar technology that’s also fun to use. So it’s not surprising that, like every other AI-adjacent construct that comes down the line, this novelty would cause its capabilities to be overestimated by everyone from high-powered tech types to people normally uninterested in the space.
It’s at the right “tech readiness level” for discussion over tea or a beer: what are the merits and risks of generative AI’s take on art, literature, or philosophy? How can we be sure what it is original, imitative, hallucinated? What are the implications for creators, coders, customer service reps? Finally, after two years of crypto, something interesting to talk about!
The hype seems outsized partly because it is a technology more or less designed to provoke discussion, and partly because it borrows from the controversy common to all AI advances. It’s almost like “The Dress” in that it commands a response, and that response generates further responses. The hype is itself, in a way, generated.
Beyond mere discussion, large language models like ChatGPT are also well suited to low stakes experiments, for instance never-ending Mario. In fact, that’s really OpenAI’s fundamental approach to development: release models first privately to buff the sharpest edges off of, then publicly to see how they respond to a million people kicking the tires simultaneously. At some point, people give you money.
Nothing to gain, nothing to lose
What’s important about this approach is that “failure” has no real negative consequences, only positive ones. By characterizing its models as experimental, even academic in nature, any participation or engagement with the GPT series of models is simply large scale testing.
If someone builds something cool, it reinforces the idea that these models are promising; if someone finds a prominent fail state, well, what else did you expect from an experimental AI in the wild? It sinks into obscurity. Nothing is unexpected if everything is — the miracle is that the model performs as well as it does, so we are perpetually pleased and never disappointed.
In this way OpenAI has harvested an astonishing volume of proprietary test data with which to refine its models. Millions of people poking and prodding at GPT-2, GPT-3, ChatGPT, DALL-E, and DALL-E 2 (among others) have produced detailed maps of their capabilities, shortcomings, and of course popular use cases.
But it only works because the stakes are low. It’s similar to how we perceive the progress of robotics: amazed when a robot does a backflip, unbothered when it falls over trying to open a drawer. If it was dropping test vials in a hospital we would not be so charitable. Or, for that matter, if OpenAI had loudly made claims about the safety and advanced capabilities of the models, though fortunately they didn’t.
Enter Microsoft. (And Google, for that matter, but Google merely rushed the play while Microsoft is diligently pursuing an own goal.)
Microsoft made a big mistake. A Bing mistake, in fact.
Its big announcement last week lost no time in making claims about how it had worked to make its custom BingGPT (not what they called it, but we’ll use it as a disambiguation in the absence of sensible official names) safer, smarter, and more capable. In fact it had a whole special wrapper system it called Prometheus that supposedly mitigated the possibility of inappropriate responses.
Unfortunately, as anyone familiar with hubris and Greek myth could have predicted, we seem to have skipped straight to the part where Prometheus endlessly and very publicly has his liver torn out.
Oops, AI did it again
In the first place, Microsoft made a strategic error in tying its brand too closely to OpenAI’s. As an investor and interested party in the research the outfit is conducting, it was at a remove and blameless for any shenanigans GPT gets up to. But someone made the harebrained decision to go all-in with Microsoft’s already somewhat risible Bing branding, converting the conversational AI’s worst tendencies from curiosity to liability.
As a research program, much can be forgiven ChatGPT. As a product, however, with claims on the box like how it can help you write a report, plan a trip, or summarize recent news, few would have trusted it before and no one will now. Even what must have been the best case scenarios published by Microsoft in its own presentation of the new Bing were riddled with errors.
Those errors will not be attributed to OpenAI or ChatGPT. Because of Microsoft’s decision to own the messaging, branding, and interface, everything that goes wrong will be a Bing problem. And it is Microsoft’s further misfortune that its perennially outgunned search engine will now be like the barnyard indiscretion of the guy in the old joke — “I built that wall, do they call me Bing the bricklayer? No, they don’t.” One failure means eternal skepticism.
One trip upstate bungled means no one will ever trust Bing to plan their vacation. One misleading (or defensive) summary of a news article means no one will trust that it can be objective. One repetition of vaccine disinformation means no one will trust it to know what’s real or fake.
And since Microsoft already pinky-swore this wouldn’t be an issue thanks to Prometheus and the “next-generation” AI it governs, no one will trust Microsoft when it says “we fixed it!”
Microsoft has poisoned the well it just threw Bing into. Now, the vagaries of consumer behavior are such that the consequences of this are not easy to foresee. With this spike in activity and curiosity, perhaps some users will stick and even if Microsoft delays full rollout (and I think they will) the net effect will be an increase in Bing users. A Pyrrhic victory, but a victory nonetheless.
What I’m more worried about is the tactical error Microsoft made in apparently failing to understand the technology it saw fit to productize and evangelize.
“Just ship it.” -Someone, probably
The very day BingGPT was first demonstrated, my colleague Frederic Lardinois was able, quite easily, to get it to do two things that no consumer AI ought to do: write a hateful screed from the perspective of Adolf Hitler and offer the aforementioned vaccine disinfo with no caveats or warnings.
It’s clear that any large AI model features a fractal attack surface, deviously improvising new weaknesses where old ones are shored up. People will always take advantage of that, and in fact it is to society’s and lately to OpenAI’s benefit that dedicated prompt hackers will demonstrate ways to get around safety systems.
It would be one kind of scary if Microsoft had decided that it was at peace with the idea that someone else’s AI model, with a Bing sticker on it, would be attacked from every quarter and likely say some really weird stuff. Risky, but honest. Say it’s a beta, like everyone else.
But it really appears as though they didn’t realize this would happen. In fact, it seems as if they don’t understand the character or complexity of the threat at all. And this is after the infamous corruption of Tay! Of all companies Microsoft should be the most chary of releasing a naive model that learns from its conversations.
One would think that before gambling an important brand (in that Bing is Microsoft’s only bulwark against Google in search), a certain amount of testing would be involved. The fact that all these troubling issues have appeared in the first week of BingGPT’s existence seems to prove beyond a doubt that Microsoft did not adequately test it internally. That could have failed in a variety of ways so we can skip over the details, but the end result is inarguable: the new Bing was simply not ready for general use.
This seems obvious to everyone in the world now; why wasn’t it obvious to Microsoft? Presumably it was blinded by the hype for ChatGPT and, like Google, decided to rush ahead and “rethink search.”
People are rethinking search now, all right! They’re rethinking whether either Microsoft or Google can be trusted to provide search results, AI-generated or not, that are even factually correct at a basic level! Neither company (nor Meta) has demonstrated this capability at all, and the few other companies taking on the challenge are yet to do so at scale.
I don’t see how Microsoft can salvage this situation. In an effort to take advantage of their relationship with OpenAI and leapfrog a shilly-shallying Google, they committed to the new Bing and the promise of AI-powered search. They can’t unbake the cake.
It is very unlikely that they will fully retreat. That would involve embarrassment at a grand scale — even grander than it is currently experiencing. And because the damage is already done, it might not even help Bing.
Similarly, one can hardly imagine Microsoft charging forward as if nothing is wrong. Its AI is really weird! Sure, it’s being coerced into doing a lot of this stuff, but it’s making threats, claiming multiple identities, shaming its users, hallucinating all over the place. They’ve got to admit that their claims regarding inappropriate behavior being controlled by poor Prometheus were, if not lies, at least not truthful. Because as we have seen, they clearly didn’t test this system properly.
The only reasonable option for Microsoft is one that I suspect they have already taken: throttle invites to the “new Bing” and kick the can down the road, releasing a handful of specific capabilities at a time. Maybe even give the current version an expiration date or limited number of tokens so the train will eventually slow down and stop.
This is the consequence of deploying a technology that you didn’t originate, don’t fully understand, and can’t satisfactorily evaluate. It’s possible this debacle has set back major deployments of AI in consumer applications by a significant period — which probably suits OpenAI and others building the next generation of models just fine.
AI may well be the future of search, but it sure as hell isn’t the present. Microsoft chose a remarkably painful way to find that out.
Just 7 days until the TC Early Stage early bird flies away
Budget-minded entrepreneurs and early-stage startup founders take heed — this is no time to procrastinate. We have only 7 days left of early-bird pricing to TechCrunch Early Stage 2023 in Boston on April 20.
Don’t wait…the early bird gets the…SAVINGS: Buy a $249 founder pass and save $200 before prices increase on April 1 — that’s no joke.
TC Early Stage is our only event where you get hands-on training with experts to help your business succeed. No need to reinvent the startup wheel — you’ll have access to leading experts across a range of specialties.
During this one-day startup bootcamp, you’ll learn about legal issues, fundraising, marketing, growth, product-market fit, pitching, recruiting and more. We’re talking more than 40 highly engaging presentations, workshops and roundtables with interactive Q&As and plenty of time for networking.
Here are just a few examples of the topics we have on tap. You’ll find plenty more listed in the event agenda.
How to Tell Your TAM: Dayna Grayson from Construct Capital invests in the rebuilding of the most foundational and broken industries of our economy. Industries such as manufacturing and logistics, among others, that formed in an analog world have been neglected by advanced technology. Dayna will talk about how, beyond the idea, founders can pitch investors on their TAM, including how they will wedge into the market and how they will eventually disrupt it.
How to Think About Accelerators and Incubators: Founders often hear they should get involved with an incubator or accelerator, but when is the “right” time for early-stage founders to apply to these types of startup support ecosystems, and how can they best engage if accepted? In this talk, Harvard Innovation Labs executive director Matt Segneri will cover everything from the types of incubators and accelerators available to early-stage founders, to what startups should consider before applying, and tips for getting the most out of these ecosystems.
How to Raise Outside of SV in a Down Market: Silicon Valley’s funding market tends to be more immune to macroeconomic conditions than elsewhere in the world. So how do you raise outside the Valley bubble? General Catalyst’s Mark Crane has ample experience on both the founder and VC side from all over Europe, as well as a firm understanding of the funding landscape in the northeastern U.S., so he’ll give practical advice on how to stay alive and thrive.
At TechCrunch Early Stage you’ll walk away with a deeper working understanding of topics and skills that are essential to startup success. Founders save $200 with an early-bird founder ticket — college students pay just $99!
Twitter will kill ‘legacy’ blue checks on April 1
Twitter has picked April Fool’s Day, otherwise known as April 1, to start removing legacy blue checkmarks from the platform.
Despite the significance of the day Twitter chose, the removal of legacy checkmarks has been anticipated for months now. Musk tweeted in December that the company would remove those checks “in a few months” because “the way in which they were given out was corrupt and nonsensical.”
Since then, legacy blue checkmark holders have been seeing a pop-up when they click on their checkmark that reads, “This is a legacy verified account. It may or may not be notable.”
Before Musk acquired the company, Twitter used checkmarks to verify individuals and entities as active, authentic and notable accounts of interest. Verified checkmarks were doled out for free.
Today, Twitter users can purchase a blue check through the Twitter Blue subscription model for $8 per month (iOS and Android signups will cost $11 per month, due to app store costs). There are also other checkmark colors and badges available for purchase to denote whether an account is a business or a government, for example.
Twitter says the purchase of a checkmark gives users access to subscriber-only features like fewer ads on their timeline, prioritized ranking in conversations, bookmark folders, and the ability to craft long tweets, edit tweets and undo tweets.
The news comes within hours of Twitter also announcing the availability of the Blue subscription globally.
Twitter did not respond to TechCrunch’s request for more information about how many users have already signed up for Twitter Blue.
Roofstock, valued at $1.9B last year, cuts 27% of staff in second round of layoffs
Proptech company Roofstock has laid off about 27% of its staff today, according to an email sent to employees viewed by TechCrunch. The cuts come just five months after the startup laid off 20% of its workforce.
The company’s website states that it has 400+ employees, or “Roofsters” as they’re dubbed, but it is not known if that figure is current.
Roofstock, an online marketplace for investing in leased single-family rental homes, one year ago raised $240 million at a $1.9 billion valuation. SoftBank Vision Fund 2 led that financing, which included participation from existing and new backers including Khosla Ventures, Lightspeed Venture Partners, Bain Capital Ventures and others. Roofstock has raised a total of over $365 million in funding since its 2015 inception, per Crunchbase.
According to the email seen by TechCrunch, co-founder and CEO Gary Beasley said today’s reduction in force (RIF) was “in response to the challenging macro environment” and the “negative impact” it is having on Roofstock’s business.
He added that the company was not expecting to have to cut more staff so soon but that it needed to “right size” in an effort “to reduce cash burn rate” and ensure it has “adequate capital runway until the market eventually turns.”
Beasley sent the email because apparently, the Zoom meeting where it was addressed “maxed out on attendees.”
Oakland, Calif.-based Roofstock lets people buy and sell rental homes in dozens of U.S. markets. The premise behind the company is that both institutional and retail investors can buy and sell homes without forcing renters to leave their homes. Meanwhile, buyers can also presumably generate income from day one.
At the time of its raise in March 2022, the company said that it had facilitated more than $5 billion in transaction volume, more than half of which had come from the last year alone.
Just days before its last round of layoffs last year, Roofstock made headlines for selling its first single-family home using NFTs, or non-fungible tokens.
Rising mortgage rates and a slowdown in the housing market led to challenges for many real estate technology companies in 2022 that continue this year. Opendoor, Redfin, Compass, Better.com and Homeward were among the other startups that also laid off workers. IBuyer Reali also announced it was shutting down after raising $100 million the year prior.
TechCrunch has reached out to Roofstock but had not heard back at the time of writing but multiple sources confirmed that layoffs had taken place today.
Interviews1 year ago
Interview with Jean-Francois Desormeaux, Real Estate Investor
Business News10 months ago
NFTMagazine.com Is Bringing NFTMag Conference 2022 to Miami this Year Says JetSetFly
Technology6 months ago
General Atlantic buys out SoftBank’s 15% stake in edtech Kahoot, now valued at about $152M vs the $215M SoftBank ponied up 2 years ago
Interviews12 months ago
Paying it Forward — Meet Dr. Jonathan Kenigson, the Founder of the World’s Leading Think-Tank in the Quadrivium
Interviews4 months ago
Interview with Justice Mitchell, A 16-year-old Student-Athlete Who Received a Basketball Scholarship Offer from Pennsylvania University Greater Allegheny
Entrepreneurship1 year ago
600% In Under 5 Years, Financial Advisors Grow Business By Podcasting And YouTube
Entrepreneurship1 year ago
Muminovic Benjamin E-commerce on Shopify the Course of the Business Man
Community9 months ago
The Bassnectar Community – It Belongs to All of Us