Excerpts - Attention Factory
June 20, 2021
The moment a video is uploaded to TikTok, the clip and its text description are queued up to go through an automated audit. Computer vision is used to analyze and identify elements within the clip, which are then tagged and categorized with keywords. Videos suspected of violating the platform’s content guidelines are flagged for human review. The audit cross-checks the footage against a massive archive for duplicate content. This system is designed to prevent plagiarism, as well as the practice of downloading popular videos, removing the watermark, and reuploading them to a new account. Videos identified as duplicate have their visibility significantly reduced. After the screening process, the video is released to a small pool of a few hundred active users. Metrics such as the number of complete views, 6 likes, comments, average play length, and shares will be analyzed to gauge the video’s popularity within its vertical category. Those that perform well pass through to the next level, where the video is then exposed to thousands of active users. Again, more metrics will be evaluated, with the top-performing videos passing on to the next level, where they gain exposure to an even larger audience. As the video moves up to higher tiers, it will gain exposure to potentially millions of users. The process isn’t entirely run by an algorithm. At the higher levels, a person on the content moderation team will watch the video and follow a set of strict guidelines to confirm that it does not violate the platform’s terms of service or have any copyright issues. There are cases of videos reaching up to a million views only to suddenly be taken down once they hit the human review process. On a platform like TikTok,
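The tiered release described in this excerpt can be sketched in a few lines of Python. This is an illustration only: the audience sizes, metric weights and pass mark are invented assumptions (the text says only "a few hundred," "thousands" and "millions"), and the names `TierMetrics`, `engagement_score` and `highest_tier_reached` are hypothetical. Each metric is assumed to be pre-normalized to a 0-1 score relative to the video's vertical category.

```python
from dataclasses import dataclass

# Hypothetical audience sizes per tier; the real figures are not given in the text.
TIER_AUDIENCES = [300, 5_000, 100_000, 1_000_000]
# Hypothetical pass mark; the platform's actual thresholds are not public.
PASS_SCORE = 0.5


@dataclass
class TierMetrics:
    """Normalized 0-1 scores measured within one tier (assumed inputs)."""
    completion: float  # complete views / average play length
    likes: float
    comments: float
    shares: float


def engagement_score(m: TierMetrics) -> float:
    """Blend the metrics named in the text using made-up weights."""
    return (0.4 * m.completion + 0.3 * m.likes
            + 0.15 * m.comments + 0.15 * m.shares)


def highest_tier_reached(metrics_per_tier: list[TierMetrics]) -> int:
    """Return the index of the last tier the video was shown to."""
    tier = 0
    for m in metrics_per_tier:
        # Stop promoting once the video underperforms or hits the top tier.
        if engagement_score(m) < PASS_SCORE or tier == len(TIER_AUDIENCES) - 1:
            break
        tier += 1
    return tier


strong = TierMetrics(0.9, 0.6, 0.3, 0.3)
weak = TierMetrics(0.2, 0.05, 0.01, 0.01)
# Passes the first two tiers, then stalls in the third.
print(highest_tier_reached([strong, strong, weak]))
```

The human review at higher tiers is left out of the sketch; in the process described above it would act as an extra gate that can remove a video regardless of its score.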
Rather than placing their eggs in one basket and boosting the metrics for just one account, unscrupulous marketers further increased their odds of success by operating tens and even hundreds of similar accounts. Longer videos were cut up and edited into shorter content sections using various filters and effects to evade the duplicate content detection systems.
A set of established rules emerged for operating mass numbers of accounts at scale while avoiding detection. New accounts needed to be “cultivated” before the platform would judge them trustworthy. This involved mimicking the expected behavior of a typical user. Immediately batch uploading pre-edited videos on a new account was a sure-fire way to get “shadow-banned,” 7 rendering the account useless. The best practice was to passively watch videos for at least seven days before posting anything and record the first few videos within the app using the phone’s camera.
“I read a lot of books.” 28 During this time, he voraciously consumed informative autobiographies. Favorite titles include Stephen Covey’s productivity classic The 7 Habits of Highly Effective People and the business bible Winning by legendary American CEO Jack Welch. Microsoft rewarded following processes and focusing on details.
This anecdote again highlights Yiming’s disregard for traditional approaches when making high stakes decisions and skepticism of the status quo.
During his time at Kuxun, Yiming had gotten to know another young entrepreneur, Wang Xing, a small-time hustler who would later become one of the most influential figures in China’s business world.
Wang Xing persuaded Yiming to join his latest venture, Fanfou, a clone of Twitter built for the China market. Yiming was like a fish in water, back in his entrepreneurial element after time away at big tech Microsoft. He served as a technical partner responsible for Fanfou’s search features, trending topics, and social analytics. The site developed fast and, at the time, was considered one of the brightest stars of the Chinese internet.
Yiming later reflected on his time working at Fanfou, saying one of the big realizations for him was how social networking and obtaining information were two separate things. On a platform like Fanfou, or Twitter, users were communicating with friends, staying in touch, and sharing thoughts. At the same time, they were also consuming information such as breaking news or articles of interest. These two activities were easy to conflate but represented two different needs. Clarity around this distinction later helped Yiming in setting ByteDance’s early direction.
With Fanfou’s website shut down, Yiming was left with little to do and waited two months before leaving. That was when venture capitalist Joan Wang pounced on the opportunity and persuaded Yiming to join a new venture. Joan was the managing director of SIG China Investment, a Chinese venture investment arm of the U.S. financial giant SIG.
When dealing with staff management issues, Yiming had a reputation for being very gentle. When unsatisfied with a staff member’s performance, he would look to tackle the situation with mild reasoning and sincere encouragement, an approach that employees have described as having a charm of its own. Many, including Rubo, commented that Yiming considers anger a useless emotion and a form of mental laziness. Instead, he strived for an ideal state “between mild joy and mild depression.”
Joan is, without a doubt, the most important investor in ByteDance’s history. 35 Although she could have hardly known so at the time, her meeting in the café that day with Yiming was a career-defining moment, the kind of deal that every venture capitalist dreams of. When ByteDance eventually goes public,
SIG
In its home market, American financial services firm Susquehanna International Group (SIG) is essentially a hedge fund with investment banks and a research business focused on the public markets. However, SIG China, the regional arm where Joan worked, acts independently as a venture capital firm investing in private unlisted companies. 38 Entering China in 2005, SIG China initially made joint investments across varied industries, even delving into areas such as mining and industrial dye companies. Joan, who grew up in China, studied electrical engineering at the State University of New York and went on to work in the IT and telecom industry for thirteen years before joining SIG as a partner in 2006 based out of Beijing. One of her first investments, where she also came to know Yiming, was Kuxun.
For the company’s first year, home was Floor 6, Section D, Building 4, of the Jinqiu Gardens housing compound, ten minutes’ drive from the Zhongguancun technology hub in northwest Beijing and surrounded by major research universities such as the Beijing Institute of Technology and Tsinghua University.
The team set up shop in a large four-bedroom, two-bathroom apartment and filled it with Ikea office furniture. Rent was 20,000 yuan a month (approx. $3,170 at the time). This was a popular strategy for early-stage startups before they gained traction and secured enough investment to transition to more formal premises. It was China’s equivalent to American startups’ basement and garage offices.
Toutiao was also winning over users with a superior experience. Even within the limited category of news apps, it outran others in product innovation, consistently iterating and being the first to introduce micro-optimizations that today are taken for granted. Unlike competitors, dragging down from the top of the feed caused the page to always refresh with new content. Knowing that network conditions were often adverse on people’s commute to work, the app preloaded articles and displayed lower-definition pictures when necessary.
Their existing apps acted essentially as self-owned acquisition channels through which they could funnel in users at no cost, particularly effective on early Android devices. ByteDance was extremely frugal in its first years. Throughout 2012 they spent one million yuan ($158,000 at the time) on promotion and gained more than one million active users by the end of the year. The acquisition cost of a single activated user was less than 0.1 yuan ($0.016). 51 Building trashy meme apps like “Hilarious Goofy Pics” was simply a means to an end. These apps were a smart way to cheaply acquire users, who could later be converted over to the mothership platform Toutiao.
the Chinese tech world subscribed to the adage that “traffic is king,” and there always were ways to monetize a business once it had achieved scale. The key concern was ever getting to that scale in a market that had already been divided up.
Fast forward a year, and after leaving Y Combinator, Matt’s startup had been acquired by Twitter, and he decided to take a break from the intense startup life to visit China for two weeks. Matt was curious to learn more about the thriving tech scene and excited to scour around for potential angel investing opportunities. Through mutual connections, he was introduced to Joan from SIG, who put him in touch with Yiming. That was how Matt ended up in building 4 of Jinqiu Gardens, standing in one of the converted apartments that were ByteDance’s offices.
Later, once all the local Chinese venture capitalists had turned down ByteDance, Joan came back to Matt to ask if he knew anyone in the U.S. who might be interested. One name immediately popped into Matt’s mind: Yuri. As it turned out, Yuri Milner’s investment company, DST (Digital Sky Technologies), already had offices in Beijing. It was one of the most prolific and successful foreign investors in China’s internet industry, having backed an all-star lineup of Chinese companies, including Alibaba, JD.com, Meituan, Didi Chuxing, and Xiaomi.
“Leanback” as a way to queue up a series of videos to automatically play in sequence. They went so far as to outfit top content partners with professional camera equipment and even held livestreaming events. But the most significant change by far was betting on “YouTube Channels.” Channels was a way for users to easily subscribe to and watch collections of content from a single source, much like old-fashioned television channels. So it wasn’t long before the homepage was redesigned around this new concept with a big blue “add channels” button as the primary call to action (see this chapter’s opening page). YouTube spent a whopping $100 million on cutting deals with premium content creators, including celebrities like Madonna and Shaquille O’Neal, Hollywood production companies, and professional wrestling organization WWE. Their choice of partners reflected YouTube’s ultimate goal at the time: to transform itself into a TV-like entertainment destination.
However, a year later, the metrics showed that the average amount of time spent by users on YouTube remained flat, according to data from ComScore. 70 Basically, none of the changes had much of an impact. It turned out that spending more on better content was not the right strategy because the problem was not that YouTube didn’t have great content. As YouTube’s engineering director Cristos Goodrow explained, “We believe that for every human being on earth, there’s 100 hours of YouTube that they would love to watch. And the content is already there. We have billions of videos.” 71 The problem was matching this vast amount of content to the right users. Encouraging people to subscribe to channels was one way of ensuring the content they liked reached them, but it still wasn’t proving to be as effective as they had hoped.
the YouTube recommendation team found themselves relying on a version of a 12-year-old item-to-item collaborative filtering algorithm initially developed by Amazon back in 1998. 74
The beginnings of a revolution
2011 saw a breakthrough as Google started implementing a new machine learning system called Sibyl to make recommendations on YouTube. 75 The impact of Sibyl was immediate; by applying better technology to its recommendations, YouTube engineers discovered they had added a rocket-ship booster to the site’s viewing numbers. The machine learning worked so well that soon, more people were choosing what to watch based on the “recommended videos” list than any other way of picking videos, such as web searches or email referrals.
Google continued to iterate and further optimize the recommendation system, later switching from Sibyl to using Google Brain developed by the company’s now-famous moonshot laboratory group Google X, led by Stanford professor Andrew Ng. Google Brain leveraged groundbreaking new advances in deep learning. Whereas Sibyl’s impact had already been impressive, the results of Google Brain were nothing short of astounding.
Yiming resolved to do neither; instead, he built a computer program from scratch that would automatically crawl ticketing sites and notify him when a spot became available. Creating the program over a single lunch break, it secured him a ticket in less than thirty minutes.
In general, recommendation systems rely on two key processes: “content-based filtering” and “collaborative filtering.” The two concepts are relatively easy to grasp. A content-based filtering system will recommend content similar to what users already like to consume. If the user enjoys watching videos with dogs and has been tagged as “dog lover,” the system will recommend more dog videos. A collaborative filtering system will base its recommendations on finding groups of users who enjoy similar content. Say Jane’s and Tracey’s interests are highly correlated. If Jane watches a video multiple times all the way through, a reliable indicator of interest, then the system will also recommend that video to Tracey.

Above left: content-based filtering. Above right: collaborative filtering.

Internet informational distribution

These various methods aren’t mutually exclusive; it’s possible to make use of them all. It’s also rare for a platform to rely on one single approach to content distribution. What site or app doesn’t use search in some form? However, most platforms lean towards one method being primary. Platforms can evolve, with the mix of methods being relied upon shifting over time. An excellent example of this is YouTube, which went from relying heavily on channel subscriptions at one stage to firmly embracing recommendations.
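The difference between content-based and collaborative filtering can be made concrete with a toy sketch. Everything here is invented for illustration: the video names, the tags, the watch histories and the 0.5 similarity cut-off are assumptions, and Jaccard overlap is just one simple way of measuring how "correlated" two users are.

```python
# Toy data, invented for illustration.
VIDEO_TAGS = {
    "puppy_tricks": {"dogs", "pets"},
    "dog_park": {"dogs", "outdoors"},
    "pasta_recipe": {"cooking"},
}
# Interest tags attached to each user (e.g. a "dog lover").
USER_TAGS = {"tracey": {"dogs"}}
# Videos each user watched all the way through.
WATCHED = {
    "jane": {"puppy_tricks", "pasta_recipe"},
    "tracey": {"puppy_tricks"},
}


def content_based(user):
    """Recommend unseen videos whose tags overlap the user's interest tags."""
    seen = WATCHED.get(user, set())
    interests = USER_TAGS.get(user, set())
    return sorted(video for video, tags in VIDEO_TAGS.items()
                  if tags & interests and video not in seen)


def jaccard(a, b):
    """Share of items two watch histories have in common (0.0 to 1.0)."""
    return len(a & b) / len(a | b) if a | b else 0.0


def collaborative(user, min_similarity=0.5):
    """Recommend what similar users watched, ignoring video tags entirely."""
    seen = WATCHED.get(user, set())
    recs = set()
    for other, other_seen in WATCHED.items():
        if other != user and jaccard(seen, other_seen) >= min_similarity:
            recs |= other_seen - seen
    return sorted(recs)


print(content_based("tracey"))   # ['dog_park']
print(collaborative("tracey"))   # ['pasta_recipe']
```

Note that the collaborative result is a cooking video that Tracey's "dogs" tag would never surface: it is recommended purely because Jane, whose viewing overlaps with hers, finished it.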
Someone with a voracious demand for information, such as an intellectual or a journalist, will likely have a preference for subscription and search. These methods hold the highest degree of accuracy and control. They also require the user to be more proactive and engaged, typing in search terms, and curating subscription lists. People with lower demand for information are more likely to hold a preference for social and recommendation. These methods are accessible and lend themselves well to news and light entertainment.
Active methods (subscription and search) are better for larger screen devices often used for serious work or study, where session times tend to be longer, and keyboards allow for accurate and fast input. Passive methods of content distribution are, in general, more suitable for the fragmented time and small screens of smartphones. Before
“The subscription model is too demanding on the users’ end. They need to know ‘what I like and what I will subscribe to.’ Users will also be on the fence about whether they should subscribe to publications with a mix of interesting and uninteresting content.
For search on both desktop and mobile, the clear market leader was Baidu, with a strong moat of technology and brand recognition. In the same way as Americans say, “Let me Google that,” Chinese also use Baidu as a verb. Baidu’s stranglehold on the lucrative search market would remain unchallenged. The 800-pound gorilla of the Chinese mobile internet was “super-app” WeChat, which at the time embraced two ways of distributing media content: official accounts that followed a subscription-based model and a newsfeed entitled “moments.” The newsfeed was, in many ways, the embodiment of WeChat founder Allen Zhang’s philosophy. His stance at the time towards algorithmic recommendation could be described as, at best, suspicious, at worst, dismissive. 90 In his mind, WeChat’s “moments” was meant to be a place for authentic communication between people. The feed was simply reverse chronologically organized posts from the user’s contacts; even the option of using filters on photos was removed.

Above: the primary methods of distribution used by various Chinese platforms around 2013

By contrast, Sina Weibo, China’s other giant gorilla of the mobile internet, was a media-focused organization. They had won their lucrative position in dominating microblogging, the so-called “Twitter of China,” not through having the best technology or user experience but by getting a critical mass of big-name celebrities and media outlets to embrace the platform. Weibo tagged users based on accounts they had subscribed to and used this as an indicator of their general interests, guiding choices over which content to recommend. But it did not see improvement in this crude recommendation technology as an essential business driver. The company was focused on better managing and extracting value from the influencers on their platforms and expanding their reach to the vast pool of users in lower-tier Chinese cities, a high growth market just opening up.
Investing in better recommendations was low down the priority list.
As part of the drive, Yiming got wind of an upcoming book, Putting Into Practice Recommender Systems. 91 It was perfect for them and written by one of China’s leading experts in machine learning at the time, Xiang Liang. 92 Yiming personally reached out to Xiang, who was then working as a researcher at video streaming site Hulu, to request a copy, but was turned down because the book had not yet been published.
Fortunately for Yiming, Baidu was late to realize the importance of personalized recommendation. Baidu’s search business was so unrivaled and profitable that it was blindsided by the coming threat from ByteDance even as it was falling behind in the new area of mobile.
The first technological breakthrough came in 2014 when ByteDance lured away Deputy Director of Search, Yang Zhenyuan, from Baidu, where he had been working previously for nine years. Yang was immediately given the title vice president of technology and set about orchestrating a major technical upgrade.
Yang’s joining opened the flood gates to many other Baidu engineers following him as ByteDance went all out to swipe the rich technical talent pool of Baidu with large pay packets and generous share options. By 2015-16, ByteDance started to break away and create a sustainable advantage in recommendation technology after capturing other big-name Baidu talents, including Chen Yuqiang and Zhu Wenjia. 94 Zhu Wenjia later led the team responsible for developing the original recommendation systems used by Douyin and TikTok.
The company’s technical expertise had reached such a level by 2016 that they could experiment with methods to algorithmically generate content. During that year’s Olympics, a ByteDance-developed bot wrote original news coverage, publishing stories on major events faster than traditional media and enjoying engagement levels comparable to articles produced by human writers.
ByteDance’s system centers around three profiles: the content profile, the user profile, and the environment profile. For the content profile, Cao gave the example of a written news article about an English Premier League football match between Liverpool and Manchester United. Keywords would be extracted from the article using natural language processing, in this case, “Liverpool Football Club,” “Manchester United Football Club,” “English Premier League,” and names of several key players from the game such as “David de Gea.” Relevance values are then assigned to the keywords. In the example, “Manchester United Football Club” was 0.9835, and “David de Gea” was 0.9973, both very high as to be expected. The content profile also includes when the article was published, which helps the system calculate when it has become outdated and stop recommending it. The user profile is built from various sources, including one’s browsing history, search history, the type of device they are using, the device location, their age, gender, and behavioral traits. Users were divided along tens of thousands of dimensions based on social data and user behavior mining to build different profiles. As you read posts recommended by the platform, it learns your preferences by tracking your behavior: what you choose to read, what you opt to dismiss, how long you spend on a piece of content, which articles you comment on, and which stories you choose to share. Finally, the environment profile is based on where the user consumed the content, for example, at work, at home, or during a commute on the subway, as people’s preferences vary given the different situations. Other environmental traits include the weather and even the stability of the user’s internet connection and which network they were on (e.g., Wi-Fi or China Mobile 4G).
The system computes the strongest statistical match between the content profile, user profile, and environment profile that will optimize the percentage of articles read and the percentage of the articles finished (i.e., time spent). This content distribution process involves allocating a “recommendation value” to each newly published story based on its quality and potential readership. The higher the value, the more suitable people the article will be distributed to. The story’s recommendation value changes as users interact with it. Positive interactions such as likes, comments, and shares increase the recommendation value; negative actions such as dislikes and short reading times lower the value. The value also decreases over time as the content becomes out of date. For fast news cycle categories like sports or stock prices, a day or two could be enough for the value to see a significant decrease. For more evergreen categories, such as lifestyle or cooking, the process is slower. Focusing
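As a rough sketch of how such a “recommendation value” might rise and fall, the snippet below adds weighted interactions to a base score and then applies an exponential time decay whose speed depends on the category. The weights, the half-lives and the exponential form are all assumptions made for illustration; the text does not say how ByteDance actually computes the value.

```python
# Hypothetical interaction weights; positive actions raise the value,
# negative ones lower it, as described in the text.
INTERACTION_WEIGHTS = {
    "like": 1.0,
    "comment": 2.0,
    "share": 3.0,
    "dislike": -2.0,
    "short_read": -1.0,
}
# Hypothetical half-lives: fast news cycles decay in about a day,
# evergreen categories over roughly a month.
HALF_LIFE_HOURS = {"sports": 24.0, "cooking": 24.0 * 30}


def recommendation_value(base, interactions, category, age_hours):
    """Base score plus interaction adjustments, halved every half-life."""
    adjusted = base + sum(INTERACTION_WEIGHTS[kind] * count
                          for kind, count in interactions.items())
    decay = 0.5 ** (age_hours / HALF_LIFE_HOURS[category])
    return max(adjusted, 0.0) * decay


feedback = {"like": 4, "dislike": 1}
# The same story and feedback, one day after publication.
print(recommendation_value(10.0, feedback, "sports", 24))   # 6.0
print(recommendation_value(10.0, feedback, "cooking", 24))  # stays close to 12
```

With identical feedback, the sports story has lost half its value after a day while the cooking story has barely moved, which mirrors the contrast between fast-cycle and evergreen categories drawn above.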
The beauty of relying on recommendations to improve engagement is that it creates a virtuous cycle of continual improvement over time, often referred to as a “data network effect.” The more time spent using the app, the more enriched becomes the user profile, which leads to more accurate content matches and better user experience. This naturally leads to more time being spent in the app, which further enriches the user profile and so on.
article’s recommendation value. This virtuous cycle is powerful but doesn’t continue indefinitely. The rate of improvement in user experience is initially fast but will asymptote over time as the user’s profile is enriched further and further to the point where a fully accurate and detailed interest graph has been formed.
Growth hacking - China style
The Shenzhen airport warehouse 3A was filled with phones, hundreds of thousands of them. Wall to wall, pallet after pallet, a seemingly endless stack of smartphones fresh from the factory production lines. Later that day, all of them would be loaded onto planes and shipped out across China’s major cities, weaving their way through a byzantine system of provincial distributors, sub-distributors, and retail store networks before finally landing in the hands of a consumer. A group of young men and women in grey overalls lined up for early morning duty. To a casual observer, these looked like a typical team of warehouse workers ready to load and unload cargo all day. Instead, the group had a very different task. “Right lads, you know the drill: five minutes for each batch of 12 phones, not a second more. Let’s go!” called the group leader. They immediately set to work. Another day’s worth of repetitive tasks lay ahead, following the same sequence: Use a special device to blow hot air on the phone box seal until the tape fell off. Carefully unbox the phone, making sure to keep everything in pristine condition. Hook the phone up to the machine, a thick plastic box, with a screen roughly the size of an iPad and a row of 12 USB ports. 100 Tap to select the correct options, then press “confirm.” Wait. Once the machine had done its thing, unhook the phone and put it back in the original box exactly as you found it, and reseal the tape. The whole process took no longer than five minutes and was repeated hour after hour, day after day. Just 86 of these machines running continuously over an eight-hour work shift would be enough to cover 100,000 low-end to mid-range priced Android devices. The goal was to batch pre-install over a dozen extra apps onto each phone, one of which was Toutiao. Back in Beijing, Yiming and executive Zeng Qiang were poring over the pre-installs spreadsheet as had become their daily routine.
Numbers of total installations and activations from each distribution channel and manufacturer were laid out in neat rows. Multiple factors were broken down and analyzed in detail: 30-day retention rates, device models, A/B tests, the coverage rates of China’s myriad of cities and townships. A complex system had evolved to optimize the budget for what had become ByteDance’s most effective way to growth hack new users and fast track their company – the grey market of cutting deals with distributors to pre-install apps onto phones after they’ve left the factory but before they reach the consumer.
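The warehouse throughput figure quoted earlier (86 machines covering roughly 100,000 phones in one shift) can be checked with simple arithmetic. The only inputs are the numbers given in the text; the assumption added here is that every machine runs back-to-back five-minute batches with no idle time.

```python
PHONES_PER_BATCH = 12    # USB ports per machine
MINUTES_PER_BATCH = 5    # time limit called out by the group leader
SHIFT_HOURS = 8
MACHINES = 86

# Assumes no gaps between batches over the whole shift.
batches_per_shift = SHIFT_HOURS * 60 // MINUTES_PER_BATCH    # 96
phones_per_machine = batches_per_shift * PHONES_PER_BATCH    # 1,152
phones_per_shift = phones_per_machine * MACHINES             # 99,072

print(phones_per_shift)
```

The result, 99,072 phones, is consistent with the “100,000” figure as a rounded estimate.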
When ByteDance started allocating budget to app pre-installation, they paid roughly 0.4 yuan per install ($0.06), which was above market rate at the time but still incredibly cheap given that over four years that price continually rose to more than 12 yuan ($1.68). The app pre-installs practice worked because most consumers were either ignorant of or indifferent to what software should be packaged on their phones. Buyers focused on the price, brand, and specs of the hardware; whatever software was on the device beyond the Android operating system simply didn’t matter to most people. Many of the preinstalled apps were either deleted or perhaps used only once out of curiosity. Toutiao was as well-placed as any to convert those who did give their app a try. Reading news and other web content was a high-frequency activity and a stable need for most people. Toutiao’s logo, a newspaper with a bright red banner and characters reading “Headlines,” left little doubt about what the app was for.
Few retailers questioned app pre-installs as the practice was rife, and, in an industry where the competition was cut-throat and margins thin, pre-installs were a welcome source of side revenue. As pre-installing apps became more lucrative, multiple levels of distributors and agents in the distribution chain embraced the practice. The manufacturer installed their set of apps, the first-level distribution agent added their batch, the second-level distribution agent added another round, even the retail stores themselves might add a few.
Other less orthodox but still effective methods of gaining installs involved hiring the services of “ground promotion” companies. Their typical tactics would require recruiting groups of female college students to stop people on the street and encourage them to install an app in return for a gift or a small cash reward. The methods didn’t work well with young people but unsurprisingly found favor with older men.
In the early years of Toutiao, ByteDance acquired tens of millions of users through pre-installation. This led to a situation where the company’s core user base was people using cheap Android devices, many of which were purchased with Toutiao already pre-installed. Over time, the content preferences of this group began to greatly affect the public’s perception of the company.
Everyone knows that junk food is bad for you, but people still love to eat it. ByteDance vehemently denied actively pushing vulgar content 105 yet was undeniably in the business of giving people what they want. It just so happened that what the masses of China wanted to feed into their brains every day was the mental equivalent of a big greasy cheeseburger—clickbait, celebrity gossip, and pictures of pretty girls. “You think that the whole of China comprises of social elites? The college education rate is only 4%,” continued Gao Han.
Yiming wished to headhunt Lidong and invited him to come and talk at the company offices. As soon as he entered the meeting room, Yiming wrote the words “user volume, click rate, conversion rate, unit price, CPM, CPC” on a small whiteboard and put together a long list of complex and esoteric calculation formulas. The technical boss then spent the next few hours explaining the derivation process of these equations. Lidong later confessed that he didn’t understand any of it, but that mattered little as Yiming’s approach of using math to deduce the profit model of advertising had shocked him greatly.
2015 Wuzhen Summit, China’s most prestigious internet conference. Yiming’s position on the furthest left side at the back indicates he is arguably the least important person present. In contrast, Alibaba CEO Jack Ma, Tencent CEO Pony Ma, Baidu CEO Robin Li, and LinkedIn founder Reid Hoffman stand centrally and directly behind Chinese leader Xi Jinping.
It didn’t take long for the Mindie team to notice their new competitor. A search for “Mindie” on the App Store resulted in Musical.ly appearing right next to them. The two apps shared the same keywords. “At the beginning, we were really surprised… Everything was the same, even some of the App Store description, the logo color and gradients,” explained co-founder Stanislas. The team realized they had made a mistake, leaving part of their code up on developer site GitHub, which Musical.ly pounced on and used to speed up their development. Searching through user profiles, they found Alex Zhu’s account and saw he had been a highly active early adopter.
To understand the sentiment of the users, Alex spent a great deal of time on the app and registered multiple fake accounts pretending to be a regular user. 140 He would comment on others’ videos and ask why they shared or created certain content, a user research tactic commonly deployed by Chinese tech bosses. They brought hundreds of their early loyal users over into WeChat messaging groups to have daily conversations. For every new feature, they shared mockups to gain immediate direct feedback. One of the first changes made based on these talks was to extend the length limit of the videos to 15 seconds to match the limit imposed by Instagram, the platform where teens wished to share their videos most.
In trying to encourage a sense of community amongst their early adopters, the team discovered that the most effective way was to regularly promote “challenges.” Challenges are essentially user-generated video memes. 141 A challenge sets up a replicable, cookie-cutter structure that allows anyone to take part and make their own version. They could be anything from a simple set of dance moves to a goofy prank.
On Musical.ly, users were encouraged to join trending challenges and make their own version. Challenges were a way of educating users and showcasing new ways to create videos. It also allowed the Musical.ly team to have greater control over the direction of content creation. The challenges gave users a purpose to participate and create videos rather than just passively watch others. “The key difference between Musical.ly and Vine is that on Musical.ly we lowered the barrier to content creation, so all the consumers are creators at the same time,” explained Alex. As he saw it, lowering the barrier to content creation was critical to Musical.ly’s success, where all content was produced by users. The most significant barrier to people creating videos wasn’t a technical one. All the new breed of short video mobile apps had easy-to-use mini-editing studios built in. Young users especially had no trouble working out how to add music and text and use the recording function. In particular, Vine’s recording feature was about as simple as one could imagine – point the camera and hold down a button to record for six seconds. Shyness also wasn’t a problem as many young users loved recording themselves. The more substantial barrier was one of creativity and inspiration: coming up with a concept. Most users need to be inspired. Very few people had as much time, talent, and dedication as already famous Vine stars such as Zach King, notable for his mastery of digital editing techniques to create magic effects. Music was one form of creative inspiration; on Mindie or Musical.ly, people could easily choose a favorite track to mime or dance to. But the Musical.ly team found that the promotion of daily challenges was a much more effective way to build a regular habit of content creation. Users didn’t need to think that much. All they had to do was follow the crowd and add their spin on an already familiar theme. Copying others wasn’t just okay; it was actively encouraged.
Challenges also helped combat the final, most difficult barrier of all—motivation. There was a sense of immediacy. Users either chose to participate in the fun challenge while it was trending today or risked missing out. Participation also gave people a sense of being part of a wider community.
By the end of 2014, Musical.ly had established a core group of loyal users. Constant daily communication with hundreds of these Musical.ly fans through WeChat messaging groups meant that the team was close to their users despite being on the other side of the world. They were learning the nuances of American teen culture, and through the communication, fans were feeding them new ideas for challenges to promote or amplify. The team also continually monitored which videos gained traction on the platform and promoted ones they felt could engage and inspire others.
This hands-on daily work was time-consuming, manual in nature, and challenging to automate. American internet firms favored scalable, data and technology-driven growth methods. Yet Musical.ly was drawing from China’s standard playbook, where such low-tech tactics were commonplace and labeled under the term “operations.” 145 It was a solid base, but they had failed so far to reach a tipping point that would allow them to scale. The founders were hesitant over whether to keep the business running, given the difficulty of securing funds, and their team was stripped down to a skeleton crew of just seven people. “Sometimes slow growth is worse than failing fast,” reflected Alex.
Dubsmash
“Lip-synching goes viral!” the smartly dressed news reporter exclaimed in serious-sounding tones while turning to face the studio camera. “That’s right, Jane, new app Dubsmash is taking the world by storm!” interjected her newsroom co-host. “The app has been downloaded more than 10 million times since it launched in October, now that’s impressive!” In early 2015, news about Dubsmash was spreading like wildfire across global mainstream newspapers and television. The app was an overnight success built by three German engineers led by CEO Jonas Druppel. Dubsmash did just one thing—let users create ten-second lip-sync videos. People loved it.
Above: Screenshots of Dubsmash, taken from the app store description in 2015.
Unfortunately, Dubsmash faced the common challenge for small teams that unexpectedly met big success early on: their back-end infrastructure was utterly unprepared for the explosive growth. There was no user account system, 146 no login or registration, and no way to add friends or interact with other users. There was also no way to post within the app; videos were downloaded to the phone and shared through other social media platforms. This made it difficult to retain users, and the hype around Dubsmash faded. The overnight success demonstrated the compelling use case for lip-syncing, and it was also an early indicator of what would happen next for Musical.ly.
Lip Sync Battle – the turning point
In early April 2015, the Musical.ly team in Shanghai noticed something unusual in its download numbers. Every Thursday night, there was an abnormal surge in installs. The Shanghai team set about trying to understand what caused this, conducting extensive research online and through its user feedback groups. Finally, the team hit upon the answer—Lip Sync Battle. Lip Sync Battle was a new TV competition co-hosted by rapper LL Cool J on the now-defunct American television channel, Spike TV. 147 A spin-off from the popular Jimmy Fallon show, it was a success straight out of the gate. The first episode, which aired on April 2, was the highest-rated premiere in the channel’s history. During and after the show, some of the audience would search for apps that let them record miming videos. Many stumbled upon Musical.ly, leading to the spike in downloads.
Building an economy, the Musical.ly way
Being a platform where all the content is user-generated required Alex and Louis to think carefully about nurturing an active community. They needed to ensure there was a stable group of committed creators regularly producing high-quality content. This would provide them with stickiness and longevity,
Alex compared the strategy they took to that of “nation-building.”
“You have to attract immigration, and in order to do so, you’ve got to make a small percentage of people rich first,” he said. In his conceptualization, Musical.ly was like a newly discovered continent that needed to entice new immigrants, much as the American colonies had been at one time. With few people, the new continent had a small gross domestic product. Distributing this pool of wealth evenly would result in everyone living miserably and failing to attract more immigrants. According to Alex, the solution was to deliberately foster an economy with a high degree of income inequality, which allocated most of the GDP to a small group of pioneers—in Musical.ly’s case, early users. Once this group got rich, the news would then spread to trigger a gold rush with others following in the footsteps of the first wave of immigrants, eager to test their fortunes in this new world. The latecomers would also have a shot, or in Alex’s words, “let them live the American dream.” He likened this process to transitioning from a centralized, planned economy to a market-driven, decentralized one, nurturing a rising middle class. 157 The big stars could remain wealthy, but there needed to be ways for talented but obscure new creators to be discovered and rewarded. “Getting rich” was a metaphor for securing the heightened social status of being “Musical.ly famous.” The team had a god-like power to manipulate attention on the platform and ensure individual accounts received massive exposure. The content operations team could manually select, at their discretion, videos to be tagged as “featured.” Such videos would be actively promoted and were guaranteed high visibility. Under this system, just a few individuals working out of a nondescript co-working space in suburban Shanghai wielded considerable influence over American teen and pre-teen culture by choosing which videos to feature, as illustrated by the #DontJudgeChallenge.
Using the metaphor of immigration between countries was Alex’s roundabout way of saying Musical.ly set the rules of the game. They drove vast traffic to specific individuals such as Baby Ariel, catapulting them overnight to online influencer status. As the platform grew, this group became outright celebrities, in some part due to their creativity, persistence, and hard work, in most part due to the invisible hand of the Shanghai content operations team tilting the attention game heavily in their favor.
attracting new followers was an uphill battle. Typically, teens had neither the photography skills nor the glamorous lifestyle necessary to be an Instagram influencer, nor did they have a witty writing style to build a Twitter account. But lip-sync and dancing were well-suited. Teens were creative, obsessive about sharing, and had plenty of free time after school. They were comfortable shooting videos and held few inhibitions about using their phone’s front-facing camera to record themselves. The Musical.ly team leaned into the youthful demographic with the app’s design characterized by generous use of loud, garish colors.
Singer Jason Derulo became the first major artist to adopt Musical.ly for sharing videos of himself dancing from his studio. Seeing his success in gaining teen attention, other celebrities were quick to grasp Musical.ly’s marketing power. The floodgates opened with artists such as Selena Gomez, Lady Gaga, and Katy Perry, all jumping on the platform to connect with the young audience. Some began using it to drum up excitement for upcoming tracks by sharing snippets of unreleased songs.
“When you want to grow early on, you want to be a brush, meaning you have to be very specific; you have to solve a specific need very well… Later you want to be a canvas; you want all kinds of things to happen on this blank canvas.”
For example, YouTube began as a video hosting tool allowing people to embed videos on web pages for free, a revolutionary concept at the time. Instagram’s initial traction came from the “killer feature” of filters. People found that with a few simple taps, Instagram could make their photos look more professional.
“Here in China, it’s a different way—if you do something right… they will follow, they will do exactly the same thing … People here have a very different business logic, they think you can always get a market share very quickly by powering in with money, and once you have that, you can kick the others out… people don’t have that patience to grow up very gradually. Everything happening in China is so fast, so people lose their patience.”
The responsibility to manage the Musical.ly clone would lie with Kelly Zhang, who oversaw user-generated content platforms, known in the industry simply as UGC. In her thirties with short hair and glasses, Kelly Zhang had been acquired by ByteDance along with her startup—online picture community, Picture Bar 176 —in
She was an internet industry veteran and a serial entrepreneur having been a founding employee of two previous internet companies. 177 Kelly had a reputation for understanding how to cultivate communities, an essential process for UGC platforms. This discipline was quite different from the model of Toutiao, where users passively consumed but didn’t create themselves. Kelly oversaw multiple ByteDance platforms, and the Musical.ly clone was just one project among many.
Amazing youth
Xiao’an was nervous—she was mid-way through her interview at ByteDance’s head office. Finishing up at university, she was looking to enter the workforce with her first real job. She had interned one summer at ByteDance previously, and last week an old colleague from back then, Xiao’wei, had sent her a message: “I’m working on a new project. Would you like to take a look?” Now, here she was, sitting in the interview room. She smiled somewhat awkwardly again, trying her best not to show her nervousness. The interviewer leaned over to show her something on a phone. “The team you would be joining is working on a new app. This is a test version.” Xiao’an smiled politely and looked down at the handset. The video immediately jumped out at her from the screen. “What the hell is this? I’ve never seen that before,” she thought to herself. The video took up the entire screen and automatically looped. The “like” and “share” buttons were overlaid on top of the actual video itself. “This app is weird,” she thought to herself. “What do you think?” asked the interviewer. “Oh, interesting,” was all she could politely muster. Xiao’wei went on to arrange for her to meet some of the team members at a bustling barbecue restaurant off Zhichun Road, not far from the office. Summertime was when evening dining spilled out onto the street—the scene was loud and lively. People clinked glasses and toasted one another after a long day of work. Over cold bottled beer and juicy meat skewers, Xiao’an was introduced to the A.me team. Zhang Yi was a product manager with tattooed arms who loved extreme sports. Every weekend, she would go out to the suburbs to ride motorbikes. Li Jian was a third-year university student. Xiao’wei had discovered him playing guitar on a live-streaming app. He had never worked in the internet industry before. “Damn. This was a very young team. It’s going to be a lot of fun,” Xiao’an said, recalling her initial impressions of meeting her prospective colleagues.
The team was inexperienced, but there was something that drew her to the group. ByteDance by now had over 2,000 employees. The A.me team was fewer than ten staff working together in a small section of the head office’s second floor. It was operating somewhat like a small startup within the much larger organization. The app was officially registered under a separate company, Beijing Weibo Vision Technology Co., Ltd., controlled by Yiming’s old university roommate, Rubo. No one in the team had much real experience making a full app from scratch. Around ten engineers were brought over and took a week to build the beta version of the app that Xiao’an had seen during her interview. It was a mess and full of bugs. Frustrations arose between the design and engineering teams. Aspects of the layout that the designers felt were blindingly obvious had to be explicitly marked up for engineering to build as they wished. 24-year-old designer Ji Ming recalled: “The things I worked on so hard to get out every day were so unsatisfactory. I really felt bad.” To improve user experience, the team invited students from nearby middle schools to chat at the company. They talked through what kind of apps they liked to use and why. Ji Ming found the students could only make judgments based on the existing apps in the market. It was difficult for them to imagine something they hadn’t already experienced. This reminded the team of the famous Steve Jobs quote: “A lot of times, people don’t know what they want until you show it to them.”
The lack of people uploading interesting videos was a massive issue for the team. Yet without proving that the app could retain and grow their tiny user base, it would never receive proper support from the parent company.
Solution: treat them like royalty
The first thing the team did was to treat its existing small band of popular creators like royalty. The operations team would chat with them individually every day, earnestly listening to their ideas, and making them feel they were participating in the platform’s growth and molding its direction. If a Beijing-based user encountered a problem that was not easy to explain online, they would be invited to chat over a free meal at ByteDance’s company cafeteria. Just like Musical.ly, the team made use of theme-based challenges to build communities and actively encouraged users to riff off each other’s creations, building upon shared memes. Users were free to initiate their own challenges, which helped the team learn what kind of content people preferred and guided them in selecting their official challenges. Many of the first official challenges originated not from the team but from ideas that came out when talking with the early video creators. They would reward the best video creators with gifts such as cameras, celebrity merchandise, or snacks, yet another way to make people feel special. Top content creators were ranked within the app on lists such as the “Most Popular List,” “Most Active List,” or “Weekly Rookie List” – helping foster a sense of community. The tactics A.me was using are part of what is termed “operations.” Heavy reliance on operations to achieve growth is one of the defining characteristics of the Chinese internet. At Western tech companies, the role of user acquisition typically falls under the marketing, sales, or growth teams, which tend to systematically achieve user growth through highly scalable data and technology-driven methodology. In addition to these established techniques, Chinese companies also favor using manual, labor-intensive methods to promote and grow their platforms.
Examples include paying for celebrity endorsements and media exposure, operating promotional accounts on other platforms, or running regular competitions and festival holiday promotions. Operations teams are typically active all day, maintaining relationships with outside stakeholders, including users, creators, and promotional partners.
A.me went as far as assigning account managers to individual creators who would do everything possible to please them, even buying them dinners and helping with school assignments or relationship issues. Next to the team’s workstation was a big box filled with all kinds of props for shooting videos, from wigs and glasses to funny placards. When an early user celebrated their birthday, the operations team would record an exclusive video for them. Around Christmas time, a team intern even went out of his way to apply for a credit card to order a Christmas tree on Amazon for Xue Lao’shi, the Canadian student who had initially refused to join.
The young designer tasked with the redesign had his epiphany when attending a rock concert lit up by beams of swirling colorful lights with a surrounding dark stage. Inspired by the psychedelic visuals at the live event, he set about creating an image that would capture the show’s euphoric feel and decided to toy around with the musical note ♪ sign. He ran the icon through various filters and settled on using something known as a glitch effect. The style is reminiscent of the static distortions of an old television suffering from a weak signal. The overall impression was one that perfectly conveyed the feeling of shaky movement. The musical note itself had been altered to form the shape of the letter “d,” signifying the first letter of the app’s name Douyin. The logo precisely matched what the team wanted the product to be. It was different, creative, and instantly recognizable. Unfortunately, this future vision was still far away from the reality of where the app actually was. The numbers didn’t lie—the data throughout the first half-year of operation was unimpressive.
Nation building: the Douyin way
With its successful rebranding and a larger budget, Douyin was now repositioned upmarket as a trendy app for fashionable young urban elites. But to pull off such a transition, the team needed to solve their biggest problem head-on—the lack of quality young content producers. The solution was art college students. The Douyin team went deep into art schools across the country to scout for good-looking students to be its users. Altogether, the team convinced hundreds of them to join, with the promise to help them get famous online. This proved highly effective. The influx of users helped build an original content pool and establish the app’s tone as cool and fashionable. The operations team then started to manipulate the visibility of videos to encourage the type of trendy content they wished to cultivate. Consequently, videos that didn’t conform to the community’s tone and values would struggle to gain visibility. Douyin mobilized its entire staff to lure in creators from competitors. They scouted
for suitable talent across all major Chinese social media platforms and even overseas Chinese on Musical.ly, messaging them one by one. To speed up the process, they also began to strike deals with “multi-channel networks” (MCNs), a type of organization that emerged in YouTube’s early years to represent and professionally manage a collective of creators. Simultaneously, the team was aggressively setting up accounts on other short video and social media platforms where they posted watermarked Douyin videos. As with Musical.ly, the watermark was the key. It was like a mini advert, for people intrigued by the videos would see the watermark and search for the name on the app store. The unique user ID of the video creator was later added to the watermark next to the flashing Douyin logo. This small but critical change further encouraged people to share their videos on other platforms, as this could now lead people back to their accounts and grow their fan bases. The operations team would continuously scan the platform for content that could be used for promotion on other platforms. By February, it saw the first hint that Douyin might have a chance at breaking out. A dance meme entitled “The Backrub Dance” 184 that had originated on Douyin started to organically spread out onto many other platforms. In March, another video caught the team’s attention. It was an uncanny
impersonation of a famous comedian, Yue Yunpeng, similar in both look and style. The team repeatedly pinged the celebrity’s official social media accounts. After much persistence, they eventually attracted his attention, and the comedian shared the impersonation with his millions of followers. The video, watermarked with the glittering Douyin logo, received over eighty thousand likes and more than five thousand forwards. Douyin’s Baidu index, a local equivalent of Google Trends, rose considerably the next day.
I’m not cool enough to use this
Douyin continued to double down on its brand positioning as an app for the coolest of the cool. It began a cinema advertising campaign in the early summer of 2017, a 30-second ad slot 185 that was a fast-paced tour de force of wild shaking camera effects matched to a loud thundering electronic beat. One industry professional described their experience seeing the commercial as “dazzling,” adding “this product is too cool, so cool it’s not suitable for me.” 186 Cinema ads were backed up by online promotional campaigns of short, playful, and interactive adverts featuring famous historical figures such as the Mona Lisa and Abraham Lincoln using Douyin. Amusing, original, and well-executed, these adverts went viral across Chinese social media, garnering curiosity and building brand awareness.
That summer, Douyin clinched a sponsorship deal for a new talent show, The Rap of China. 187 The program was endorsed by several big-name celebrities and became a breakout hit with China’s urban youth. Hip Hop culture became all the rage, in part because of the show. Douyin was the perfect tool for young Chinese to create their own rapping, breakdancing, beatboxing, and street dance videos. Female contestant VaVa, sometimes referred to as “China’s Rihanna,” commented: “All the people into Hip Hop are all on Douyin.” 188 Somewhat echoing how Lip Sync Battle had been a game-changer for Musical.ly in the U.S. market, shows like “Hip Hop in China” and its rival program “This! Is Street Dance” pushed Douyin to the forefront of China’s youth culture.
Knowing that young Chinese people were highly sensitive to the personal image they displayed online, a dedicated engineering team was set up to build best-in-class beautifying filters and special effects for Douyin. These lowered the barrier to content creation and gave users the confidence to shoot without makeup.
The crowd goes wild for Yiming
Hundreds of fashionably dressed young people were arriving at 751 D.PARK, an expanse of industrial plants redeveloped into a hip culture venue in northeast Beijing. They were clad in baseball caps, brightly colored dresses, loose-fitting hip-hop style streetwear, and limited-edition sneakers.
The breakout
October 1st marks the beginning of “Golden Week,” a seven-day official Chinese national holiday. Periods like these are big opportunities for China’s internet industry. People’s behaviors change for a week; many find more time for entertainment and to try new things. Over October, Douyin’s daily users doubled from seven to fourteen million; two months later, they reached 30 million. Over those three months, the 30-day retention rate jumped from 8% to over 20%, and the average time spent in the app soared from 20 to 40 minutes. 192 It was as if some magic rocket fuel had suddenly been added, boosting every key metric. What had changed? The answer was Zhu Wenjia. Zhu Wenjia, hired from Baidu in 2015, was widely considered to be one of the top three people in the entire company when it came to algorithm technology. 193 He ran one of ByteDance’s most capable engineering teams and had recently been assigned to work on Douyin. The team’s work harnessing the full power of ByteDance’s content recommendation backend led directly to the astounding October results. The better the metrics, the more resources ByteDance placed behind the app as it now had good retention and was fast-tracked into becoming a strategically important product. Suddenly support was
coming in from all over the company—people, money, user traffic, celebrity endorsements, brand collaborations, and most importantly, full integration and optimization of ByteDance’s powerful recommendation engine. Chinese stars with massive fan bases such as Yang Mi, Lu Han, Kris Wu, and Angelababy opened accounts and joined publicity campaigns, and a nationwide “Douyin Party” event roadshow was planned. Douyin had become the hottest upcoming app in China.
Which factors drove Douyin to succeed where Musical.ly had failed?
Infrastructure
Firstly, we need to acknowledge that regardless of anything ByteDance did or didn’t do, simply by launching three years later than Musical.ly, Douyin already enjoyed more favorable conditions for success. By 2017, fast, affordable, stable, and ubiquitous 4G internet had become widely available across China.
Kelly Zhang drew attention to four factors—full-screen high definition, music, special effects filters, and personalized recommendations.
Support from the mothership
On the surface, Douyin’s relationship with ByteDance during early development was similar to “Instagram and Facebook” or “WeChat and Tencent,” i.e., a small, agile startup in a much larger established organization. Under no pressure to monetize early and spared the distractions of negotiating new investment rounds, they could simply focus on growth and building the best product. Remaining independent from their parent company, while also benefiting greatly from the access to technical expertise, money, and
Douyin (and later TikTok) had a founding team, but no real “founders” in the traditional sense. That’s because unlike most big Western social media platforms, Douyin’s success wasn’t born from the vision of an individual; instead, it arose from a systematic process of experimentation within the organization.
ByteDance’s original three-pronged short video strategy when the company chose to build their own versions of three successful and already proven models—YouTube, Kuaishou, and Musical.ly. All three would plug into the company’s existing technology stack and big data – the most critical elements
ByteDance’s “powerful weapon,” 204 to quote the head of its AI Lab, was its content recommendation engine and a pre-existing database of millions of user profiles and interest graphs. ByteDance’s core content
Similar to newsfeeds, short-form videos were a perfect match for this process. The user would typically swipe or tap the screen multiple times per minute, and each interaction revealed a little more about their preferences and could be used to further enrich their interest graph. In contrast, long-form videos provided much less data because people could watch a 45-minute episode of a drama series without touching the screen once. Even
Previously, Musical.ly had wasted much time and effort mistakenly believing their app was first and foremost a social network. With Douyin, the company culture and technology stack were aligned with the true value of this format—a content platform. In a content-based community, content is more critical than people. Douyin was a recreation of television entertainment for the mobile age, not a new video-first Facebook.
Douyin’s success relied heavily on algorithmic recommendation, which removed the need for social relations. While the social graph held some potential for improving those recommendations, it also brought with it negative aspects, as adding family, acquaintances, or co-workers can inhibit the sense of freedom. Getting things right was a delicate balance. Users want to be comfortable about what they upload and who they follow—this often requires a degree of anonymity and disconnection from their daily contacts. Meanwhile, the content needs to be relatable and organic enough to foster a sense of belonging and participation.
By mid-2018, it was clear to the Chinese internet industry that ByteDance was now a threat to Tencent. WeChat met the need for communication, while Douyin fulfilled a need for entertainment. On the surface, these are entirely different, but they played very similar roles for their parent companies—sucking up attention and acting as distribution channels for other services. Traffic is king.
Yiming had a simple strategy for evolving ByteDance to the next level – hire or acquire the absolute best people and infuse their knowledge into the organization. To improve the company’s nascent recommendation engine, Yiming relentlessly poached top-level experts from Baidu. To start monetization, he headhunted one of the rising stars in traditional media advertising, Zhang Lidong. Similarly, Flipagram, along with other early deals such as news aggregators Dailyhunt in India and BABE in Indonesia, allowed Yiming access to vital local business know-how and expertise. ByteDance could pick the brains of experienced founders and quickly accelerate its understanding of local market nuances.
At the time, the best examples of domestic Chinese developers going abroad, aside from the notable exception of Musical.ly, were utility apps that held universal cross-cultural appeal and did not require localization. Examples included video editing apps such as VivaVideo and VideoShow, which had achieved impressive numbers with minimal investment. By the end of 2016, VideoShow claimed 100 million registered users worldwide, and monthly active users exceeding 11 million with zero marketing spend.
By the second half of 2017, Douyin had proved itself in China. Sophisticated technologies, including video analysis, AR filters, and ByteDance’s proprietary recommendation engine, had laid the foundations for rapid growth.
One lesson learned from Douyin had been that user-generated content apps first need to cultivate a committed group of local, high-quality seed creators that define the tone of the community and can generate memes for others to mimic. That took time. Rushing in and spending heavily on ads without any kind of community would be counterproductive.
The Tokyo team placed much effort into identifying and approaching suitable online influencers for the new platform. This group could create high-quality content and build awareness, and part of their existing follower base would convert over to the new platform. There were two kinds of influencers: celebrity stars and niche area KOLs (Key Opinion Leaders). Celebrities had broader audiences, usually measuring in the millions, while KOLs in niche areas, such as cooking or dance, possessed smaller but loyal and engaged follower bases.
The big problem was the gatekeepers—talent management agencies that controlled access to celebrities and the best KOLs. For TikTok, these organizations were an impenetrable fortress; no one knew TikTok, which meant no agency would take them seriously.
At last, a breakthrough came with female celebrity Kinoshita Yukina. 232 Once the operations team discovered she had become a user, they immediately contacted her representative office. Kinoshita enjoyed using TikTok very much and was open to collaboration, but her agency expressed strong reservations. “It took around six or seven rounds of discussions to seal the deal finally. The star studios in Japan are particularly prudent, so we need to talk to them time and again to familiarize them with our product and show our sincerity for cooperation,” 233 explained the then director of TikTok Japan.
Additionally, the operations team ran promotional accounts on other platforms. TikTok Japan’s Twitter account was registered in May 2017, making it probably the very first TikTok promotional account. 235 Videos posted reveal a content style similar to early Douyin: dance and lip-sync for young people. Some early adopters described discovering TikTok from Twitter and embracing the app because of its advanced video editing suite, filters, and special effects. Neither YouTube nor Instagram offered such diverse options, making TikTok a useful tool for creating videos to upload to other platforms. Each video’s watermark acted as a mini advert, helping drive downloads of TikTok. “TikTok comments mostly discussed how the video was filmed, and the shooting techniques, comments on Twitter will be different,” 236 noted early Japanese TikTok influencer Kotachumu, comments that reveal the strong video production focus of TikTok’s earliest users.
To address a widespread cultural aversion to individualism, the operations team emphasized challenges that allowed people to participate together in groups 238 and filters that could be used to make faces less recognizable, reducing self-consciousness and allaying concerns over physical appearance. Much operational expertise had been built up from Douyin that could be transferred over to TikTok Japan. This included a proven back catalog of highly engaging challenges that would generate online buzz, luring in more local stars and celebrities.
Standardized elements – universal across all markets
Branding: the TikTok name, logo, and distinctive visual identity
UX, UI: the core features and design, product logic
Technology: recommendation, search, classification, facial recognition

Localized elements – tailored to specific geographies and languages
Content: the pool of recommended videos
Operations: marketing, promotion, and growth

Ancillary services could also be localized once the user base reached scale. These included:
Commercialization: ad sales, business development
Others: government relations, legal and content moderation

Central to this system was the concept of regionalized content pools based on geography, culture, and language. 239 The core TikTok experience was the “For You” feed, which was localized for each market. Japanese users would not be recommended content from Indonesian accounts and vice-versa. Each country or region was essentially an isolated island. 240
“globalize products and localize content.”
For a Japanese user to find and view videos from a friend in Indonesia, they would have to use the search function, which allowed users to see videos from any global TikTok account or hashtag they wished to seek out. Yet only a small proportion of traffic came from people using search; by comparison, the “For You” feed was the app’s core experience and where almost everyone spent most of their time. Occasionally, content would be imported between countries to diversify categories and expose users to different video styles. People needed to be shown the possibilities and educated to create new forms of content. Given that Douyin and TikTok were separate networks, there was much scope for beneficial exchanges between the operations teams. When something became popular in China, the teams would judge if it was suitable to introduce into other markets. Vice versa, something trending on TikTok might be brought into Douyin.
Videos imported from Douyin and other regions could be used to kickstart a new region’s content pool. These were essential “teaching materials” to guide and inspire. The videos had been verified as enjoyable and reproducible, allowing users to imitate them easily.
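The regional content pool idea can be illustrated with a minimal Python sketch. It is not ByteDance's implementation: the `Video` and `Catalog` classes, the region codes, and the `import_video` helper are all hypothetical names chosen for illustration. The sketch only shows the three behaviours described above: a feed restricted to the viewer's region, a search that spans all regions, and manual imports that seed one region's pool with clips from another.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Video:
    video_id: str
    region: str           # region of the uploading account (hypothetical field)
    hashtags: tuple = ()


@dataclass
class Catalog:
    videos: list = field(default_factory=list)
    # Hypothetical: clips hand-picked by operations staff for another region.
    imported: dict = field(default_factory=dict)  # region -> set of video ids

    def for_you_pool(self, viewer_region):
        """Feed candidates: local videos plus any curated imports."""
        extra = self.imported.get(viewer_region, set())
        return [v for v in self.videos
                if v.region == viewer_region or v.video_id in extra]

    def search(self, hashtag):
        """Search is global: it ignores the viewer's region."""
        return [v for v in self.videos if hashtag in v.hashtags]

    def import_video(self, video_id, target_region):
        """Add a clip from elsewhere to the target region's pool."""
        self.imported.setdefault(target_region, set()).add(video_id)


catalog = Catalog(videos=[
    Video("jp1", "JP", ("dance",)),
    Video("id1", "ID", ("dance",)),
    Video("cn1", "CN", ("reveal",)),
])

jp_feed = catalog.for_you_pool("JP")         # only the Japanese clip
global_hits = catalog.search("dance")        # Japanese and Indonesian clips
catalog.import_video("cn1", "JP")
jp_feed_after = catalog.for_you_pool("JP")   # now also holds the imported clip
```

Keeping the import list separate from the videos themselves mirrors the text's point that cross-region exposure was an occasional editorial decision, while isolation was the default.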
Musical.ly had spent almost nothing on marketing. Instead, they relied on word of mouth and leaned heavily into fostering strong user communities. Groups of highly motivated super fans ran the company’s various social media accounts on other platforms and organized offline meetups, all without pay. Some users loved Musical.ly so much they just wanted to help. That kind of organic growth strategy was simply not viable in China, where the internet market was a relentless, unforgiving, dog-eat-dog world. The competition was brutal, the pace of change was rapid, and the cost of acquiring users had long been far higher than in overseas Western markets. To make it in China, Musical.ly needed to take a different approach and be willing to spend. The founders decided to base the China operations team in Beijing, away from their well-established Shanghai headquarters, building a new team with a more competitive culture. Early June 2017 saw Musical.ly officially re-enter China. Ditching the old “Mamma Mia” name, they rebranded under the identity “Muse,” looking to leverage their already established reputation in the American and European markets. It did little to help. They were now a latecomer in a previously fragmented short video sector that was rapidly consolidating around the largest companies with the most resources. Co-founder Louis reflected: “I regret not (re)entering China earlier. Now the streets are shaking, making me extremely uncomfortable.” 241 Louis was referring to Douyin, the Chinese character “dou” meaning “to shake.”
By 2019, ByteDance’s revenue was estimated to be 120-140 billion yuan, approximately $17-20 billion, with Douyin accounting for $10-12 billion of that total, roughly 60%. ByteDance’s commercialization teams had brainstormed a multitude of ways to capture value, converting eyeballs into dollars and squeezing every last cent out of Douyin.

Roughly 80% of Douyin’s revenue was attributed to advertising, including multiple formats, such as sponsored brand challenges and full-screen adverts that played immediately upon opening the app. Among the various formats, the lion’s share of revenue came from the “in-feed video ads.” These adverts took over the entire screen and played automatically, appearing like a regular TikTok video with just a tiny advert label located at the bottom of the screen next to the video’s description text.

Above: Screenshot examples of Douyin’s various monetization models, including in-feed video ads, games, virtual gifting, and e-commerce.

Marketers quickly worked out that if they made their promotion look like a typical user-generated video rather than a professional, slick, polished advert, they could easily trick ad-averse users into watching the first few seconds, enough time to get their message across. Accordingly, the format proved popular and effective.

Above: the estimated 2019 revenue breakdown of Douyin. 245

Another monetization method is platform commissions from “Star-Chart,” 246 ByteDance’s official influencer data management platform. Brands that wish to work with influencers on Douyin must run their campaigns through Star-Chart or risk having the promotional video removed without notice. The platform takes a cut of all fees paid by brands to influencers. There are also revenues from live-stream show tipping and live e-commerce. “Extended businesses” included revenues generated through games, paid knowledge, and e-commerce.
“Others” included revenues generated through blue tick account verification fees and “DOU+,” a system through which any user or creator can pay the platform to boost a video’s visibility. Above: Internet advertising in China by category. Short video became China’s second-largest advertising category behind e-commerce and exceeding search, a shift that was primarily led by Douyin. 247
Several popular social elements were removed, including Musical.ly’s leaderboard, a feature that displayed the most popular videos posted each day by country. The feature gave a sense of community to the young users, and its removal was unpopular. The leaderboard centralized attention towards the most popular video categories, which for Musical.ly were teenage dance, meme, and lip-sync videos. This didn’t fit with TikTok’s goal to age up the user base and encourage diversification of content categories.

The signup process was significantly streamlined, removing the need to register or log in with an existing account. New users only needed to select from a list of interests (e.g., animals, comedy, art) and could start watching videos within seconds. 263 The app would use a “shadow profile” based on the device ID, which allowed for personalization of content even for those without registered accounts. Asking someone to create an account before they even knew if they liked the app was putting the cart before the horse. The new process allowed people the freedom to experience TikTok without committing.

Video sharing was aggressively encouraged. Once a clip had looped several times over, indicating that the viewer found the video interesting, an attention-grabbing “share” icon would flash. Videos shared to other platforms now contained the flashing TikTok watermark. The logo was difficult to ignore as it alternated from one corner of the video to the other, continually vibrating.

The single most significant change was switching over to the same backend infrastructure used by Douyin. For users, this change was expressed as the main “Featured” feed being replaced by the new “For You.” The two titles accurately summarized the difference: “For You” was entirely personalized using sophisticated machine learning technology, while “Featured” was Musical.ly’s old system, which combined less advanced recommendation with videos manually selected by the content team.
Time spent in the app was said to have doubled after switching the backend over to Douyin’s system. “The content we wanted was in the app, but it was hidden by our initial architecture. Once we changed this, content diversity followed in droves,” explained James Veraldi, former head of product strategy at Musical.ly. “I asked some (ByteDance) friends who had some data on the before and after. The step-change in the graph was anything but subtle,” 264 revealed former Hulu and Amazon employee Eugene Wei.
This single change was transformational, an effect which echoed that of YouTube’s introduction of the Sibyl machine learning backend system in 2011. “The content is already there. We have billions of videos,” explained YouTube head engineer Cristos Goodrow at the time. Use of machine learning to classify and recommend videos had been the key to unlocking the potential of both platforms’ vast pools of content.
TikTok was bizarre. An endless stream of people posting weird content with almost a total lack of self-awareness. Mindless comedy skits, lip-sync, and just outright wacky oddball creations. The kids making these videos could be forgiven; they were just kids. But the adults posting on the app came off simply as creepy and weird. Countless numbers of TikTok cringe compilations started appearing on YouTube, many with millions of views. Criticism of the app became widespread, with the shaming of TikTok users becoming a regular occurrence on Twitter and Reddit.
In China, Douyin had first garnered attention as a popular app for urban youths, associating itself with art students and fashionable hip-hop lovers. Yet in America, it was the absolute opposite. TikTok had entered the public consciousness as a cringe app for losers and misfits. What was going on? The answer was ByteDance’s truly massive advertising campaign across major Western social media platforms such as YouTube, Instagram, and Snapchat. The advertising campaign’s budget was reported by the Wall Street Journal to be over $1 billion in 2018. 267 ByteDance became Facebook’s biggest Chinese customer as it grew TikTok’s footprint with app-install ads. 268 Many Americans suddenly found TikTok ads were everywhere they looked online.
Yiming was never one to keep to conventions. When buying his first apartment in Beijing, rather than consulting with real estate agents, discussing with family, or personally visiting housing compounds, he found a shortcut. Yiming crawled the web for data, scraped everything into spreadsheets, and crunched the numbers in a single evening. When it came to advertising TikTok and newly acquired Musical.ly, ByteDance found a similar shortcut, but the strategy was somewhat unorthodox: it would simply use videos from the app itself. The platform’s terms of service gave it the right to do so.
After manually identifying and removing potentially inappropriate content, the company implemented a systematic process to experiment with various videos. 271 The adverts didn’t actually say anything about what TikTok was or why anyone would want to use it; they simply needed to pique people’s interest. The goal was simple—find the clips that got the most people to click on a big blue “install” button.
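The selection logic described here, picking whichever clips produce the most installs, can be sketched in a few lines of Python. This is an illustrative assumption rather than ByteDance's actual tooling: `CreativeStats`, `pick_winners`, the clip names, and the `min_impressions` threshold are all hypothetical. The point of the sketch is that the ranking looks only at conversion numbers and never at what the clip shows.

```python
from dataclasses import dataclass


@dataclass
class CreativeStats:
    clip_id: str
    impressions: int   # times the ad was shown
    installs: int      # clicks on the install button that led to a download

    @property
    def install_rate(self):
        # Guard against division by zero for clips that were never shown.
        return self.installs / self.impressions if self.impressions else 0.0


def pick_winners(stats, top_n=2, min_impressions=1000):
    """Return the clip ids with the highest install rate.

    Clips with too few impressions are skipped so that a lucky handful
    of installs does not outrank well-tested creatives.
    """
    eligible = [s for s in stats if s.impressions >= min_impressions]
    ranked = sorted(eligible, key=lambda s: s.install_rate, reverse=True)
    return [s.clip_id for s in ranked[:top_n]]


stats = [
    CreativeStats("polished_dance", 5000, 50),   # 1.0% install rate
    CreativeStats("oddball_skit", 5000, 150),    # 3.0% install rate
    CreativeStats("too_new", 100, 20),           # excluded: too few impressions
    CreativeStats("lip_sync", 4000, 60),         # 1.5% install rate
]

winners = pick_winners(stats)  # ["oddball_skit", "lip_sync"]
```

Because nothing in `pick_winners` inspects the content, a process like this will keep spending on whatever converts best, however strange the clip.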
This ad buying process was run from Beijing by the company’s experienced growth hacker teams. There was just one issue: the teams had a laser-like focus on conversion metrics but little understanding of the actual video content. Whatever converted best would be used more regardless of what the actual video showed. It turned out that wacky, outlandish, downright weird videos worked really well at getting people to click big blue “install” buttons. Many of these weird ads were attracting social misfits. When these people started using TikTok, they, in turn, made strange videos that would attract more social misfits and so on.

TikTok’s video classification systems were highly sophisticated and able to identify and classify all kinds of subculture content accurately and automatically. The system was also able to tag users more effectively based on their actions and precisely match them with content in a way that Musical.ly had never been able to do.

A prominent example was “Furries,” a stigmatized and misunderstood community of people who derive enjoyment from dressing up as animal characters in large fursuits. 272 Furries were big early adopters of TikTok in the U.S. 273 Many built significant followings as the colorful cartoon-like animal costumes proved attractive to the app’s large pre-teen user base, bringing the sub-culture to a new audience. Other notable early TikTok adopter communities included cosplayers and gamers. The animosity between these groups led to the “Furries Vs. Gamers War” 274 meme, a lighthearted imaginary conflict which saw gamers pretending to have been kidnapped by furries and roleplaying acts of espionage, feigning to have infiltrated the ranks of the furries.

Above: TikTok was acquiring new users at a much faster rate than Musical.ly. It was then accurately and efficiently matching those users with niche content based on their personal preferences in a way that Musical.ly never could.
TikTok contained a “duet” feature, which allowed two videos to appear side by side, splitting the screen. Duet had previously been restricted in Musical.ly, but now users could respond to any video by recording one of their own. With many weird niche subcultures like furries on the platform, “duet” became popular, quickly transforming into a bullying and harassment tool. As a countermeasure, settings were later added, allowing users to disable duets.
Since merging Musical.ly with TikTok in August 2018, the platform had been moving in a vastly different direction, and not everyone was happy. “TikTok’s early (unintentional) positioning in the states basically was cringe,” explained an early TikTok employee who wished to remain anonymous. The app had an awful image problem. It was widely perceived as being only for misfits and kids making lip-syncing videos.
With a massive influx of users driven by heavy spending on ads, creators found it easy to grow large fan bases quickly because of the imbalance between the supply of and demand for good content. Instagram, YouTube, and others were saturated with people competing for attention. TikTok was wide open and began to attract its own batch of content creators and online marketers. Eventually, those looking to gain attention online will always follow the numbers. The dynamic was similar to the metaphor Alex Zhu had used years earlier with Musical.ly – to encourage immigration to your new country, “some people need to get rich first.”
Old Town Road

Montero Hill, an unemployed 19-year-old with an empty bank account, was sleeping on the floor of his sister’s house. He had dropped out of college to pursue his dream of becoming a famous rap artist, recording tracks in the closet of his grandma’s house and releasing the music on audio distribution platform SoundCloud under his artist title “Lil Nas X.” Atlanta native Montero came from a humble background; his parents had divorced when he was six, leaving him to be raised by his mother and grandmother in a run-down public housing unit.

The one thing Montero had going for him was that he was a skilled and tenacious online marketer, having mastered the art of producing viral Twitter posts. 284 He spent several hours each day online promoting himself and his songs. Yet despite these efforts, his dreams of fame and fortune had failed to materialize. The material lack of success contributed significantly to his perpetual state of anxiety, daily headaches, and insomnia. 285

Searching YouTube one day in late 2018, Montero discovered a catchy banjo sample 286 from a young producer in the Netherlands, also working out of his bedroom. Straight away, he sensed the track was exceptional and purchased the rights to the music for $30. Montero set about creating a unique blend of country music-inspired lyrics to accompany the backing track. The result was a novel combination of disparate music styles – a heavy bass sound juxtaposed with strumming banjos and Montero’s playful lyrics “Cowboy hat from Gucci, Wrangler on my booty.” He named the track “Old Town Road” and termed its genre as “country trap.” Taking advantage of a special deal at a local recording studio, he spent just $20 to rush out a recording in less than an hour.

Montero began resolutely promoting the new track online. He created cowboy-themed Twitter memes that tapped into the resurgence of interest in cowboy culture driven in part by “Red Dead Redemption 2,” the most popular game of the past year.
The outstandingly catchy track had become a moderate success, yet after two months of relentless effort, interest in the song began to wane. Everything changed on February 16th, when the track was picked up by a small TikTok influencer, Michael Pelchat, who created a video of himself dressed as a cowboy dancing to the song. 287 A meme quickly formed around the track with people transforming themselves into dancing cowboys at the exact moment the beat of the groove kicked in. 288 Interest around the song exploded with millions of people using it as the background sound to make short videos. “It went crazy!” exclaimed Pelchat, “Everyone was dressed like a cowboy for like three weeks.” 289

TikTok’s meme virality superpowered the song’s popularity. Interest across America snowballed so quickly that radio station networks resorted to using an MP3 audio file ripped from YouTube to get the track on the air. 290 TikTok had propelled Old Town Road to being a phenomenon, the hottest song in America at a time when Montero hadn’t even signed with a record label.

“Old Town Road” ended up as one of the most successful songs of all time, winning multiple music awards. The track became the longest-running number one single in Billboard Hot 100’s 61-year history, spending 19 weeks at the top. Montero was signed by Columbia Records and later recorded a version with country music legend Billy Ray Cyrus. Reflecting on his fairytale success story, Montero was in no doubt as to the cause: “TikTok helped me change my life.” 291

The fairytale story speaks to the critical importance of content distribution. Even an exceptionally savvy marketer like Montero found distribution on saturated social media platforms such as Twitter, Instagram, and Facebook an uphill struggle. In a world of abundant music where the barriers to creating professional quality tracks were so low (Montero had spent just $50), it follows that merely creating an exceptional quality song is no guarantee of success.
Instead, “cutting through the noise” is often the deciding factor. One wonders how many other “Old Town Roads” are out there waiting to be discovered?

The power of memes

Old Town Road was the best illustration yet of the exceptional power of music-driven, user-generated video memes. Memes drastically lowered the creative and motivational barriers to content creation, providing a cookie-cutter structure allowing anyone to take part. These memes were referred to commonly as “challenges,” a term that explicitly communicated their participatory nature.

Memes were undeniably a critical driver of TikTok’s success. The videos typically came across as frivolous fun and resistant to logical analysis. Yet, with enough exposure, formulas begin to emerge. Similar to how all storytelling could be reduced down to seven basic plotlines, 292 video memes could also be categorized into a fixed number of genres.

Reveal memes involve a short set up, followed by a dramatic transformation or reveal following the structure of the accompanying song. The set up happens during the song’s introduction, with the reveal beginning at the exact moment the song’s main riff or hook kicks in, amplifying the dramatic effect: a mini-story sequence compressed into 15 seconds. Several of the all-time most popular video memes are reveal memes, such as the “Don’t Judge Me Challenge” that originally powered Musical.ly to top the U.S. app download charts in 2015. Other prominent examples include the iconic “Karma’s a Bitch” (Douyin, 2017) and “The Harlem Shake” (YouTube, 2013).

Above: Examples of the various video meme formats.

Dance memes involve mimicking a set sequence of novel dance moves or hand gestures that accompany the lyrics or beat of a song. The portrait ratio of vertical video lends itself particularly well to capturing an individual dancing.
This category is widely accessible and popular with teens recording themselves in bedrooms, with the ultra-low barrier to taking part being merely to move your body or mouth lyrics. Musical.ly’s first set of popular teen influencers included many, such as Baby Ariel, who mastered the art of creating hand gestures to accompany song lyrics.

Challenge memes involve completing a difficult, unpleasant, or skillful task. An early example of this format was the “Ice bucket challenge” (2014), which saw celebrities record themselves pouring buckets of iced water over their heads. The famous “Bottlecap challenge” (2019) required twisting the cap off a bottle with a kick. In China, the “A4 Waist Challenge” (2016) saw women prove their slenderness by covering the thinnest portion of their waists with nothing but a piece of A4 paper.

Filter memes are based around the use of a specific special effect. ByteDance quickly realized that users would adopt innovative and fun AR video filters to create memes. A newly released filter could completely solve the creativity problem of making a video enjoyable for others to watch. On TikTok, the filter would, much like a hashtag, act as a channel for content discovery, meaning those who adopted popular filters early had the chance to gain high exposure and many new followers. An instructive example was the popular “Mirror reflection challenge” 293 (2020). The filter simply mirrored the left half of the screen onto the right side. Users quickly experimented and learned to align their faces to create a multitude of exciting effects and reveals.

Concept memes are precisely that – a concept. One that is novel but sufficiently replicable by others who, in turn, may add their own twist. An early non-video example was the craze for lying face down in public places, termed “Planking” 294 (2011). The “Mannequin challenge” 295 (2016) was another notable example.
Those being recorded would remain frozen in action, like mannequins in a shop window, while a camera panned over them. The overall effect was as if time had suddenly stopped. A noteworthy concept meme was the “Gummy Bear Challenge” 296 (2019), started by Czech TikTok user David Kasprak. The iconic chorus of the Adele song “Someone Like You” plays as the camera slowly pans over hundreds of small gummy bears, giving the impression they are the ones singing the song. Legions of bears wail in unison “Never mind, I’ll find... someone like youuuuu”; their collective sorrow is palpable. The video is a mesmerizing 15-second artwork, all the more notable for its lack of a human protagonist.

Popular video memes straddled the “goldilocks zone” of difficulty. If they were too easy to replicate, they quickly became tedious; if they were too complicated to copy, they could not spread. Prior exposure to a meme facilitated the mental processing of new variants as it provided a familiar structure into which the new information could be integrated.

Music was the most potent trigger for making these mental connections. The human brain is naturally attuned to detecting patterns in its environment. The second Old Town Road started playing on a new TikTok video, the viewer connected what they were seeing with all the other Old Town Road videos they had experienced. It immediately established familiarity and set up expectations of what would unfold. Old Town Road was a particularly potent combination: an accessible dance meme, matched with a “reveal” of people switching to cowboy-themed clothes at the instant the song’s chorus began.

Professionally generated music videos had become established with the rise of television channel MTV in the 1980s. Smartphones and memes were helping to form a new category of “user-generated music videos.” Some artists even began to write lyrics to new songs with short video-friendly hand gestures in mind.
Where’s the moat?

Such easy onboarding and lack of reliance on a traditional social graph left many people struggling to pin down precisely what TikTok’s “moat” was. What was stopping its deep-pocketed, established competitors from moving in and eroding its market share? It turned out that American companies competing with TikTok would hit exactly the same barriers that Chinese companies had met in competing with Douyin. Building a basic TikTok clone and gaining a small market share was easy. Creating a superior version of TikTok, once it had become established in a market, was exceedingly difficult.

Only the largest internet companies had the resources to realistically match TikTok’s advantage in automated video classification and content recommendation systems. The technology allowed for superior matching of long-tail content with users, which further amplified the already existing advantage in content. The more time users spent in TikTok, the more detailed their user profile interest graph became. Put simply, the more you used TikTok, the more personalized it became. This dynamic ensured the early experience of using any clone would always be inferior.
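The compounding effect of usage on personalization can be shown with a toy Python sketch. It is a deliberately simplified assumption, not TikTok's recommendation system: the `update_profile` and `rank_candidates` functions, the category labels, and the use of watch fraction as the only signal are all hypothetical. A viewer with no history gets an undifferentiated ordering, which is the position any new clone starts from, while each additional viewing event sharpens the ranking.

```python
from collections import Counter


def update_profile(profile, category, watch_fraction):
    """Record one viewing event; fully watched clips count for more."""
    profile[category] += watch_fraction
    return profile


def rank_candidates(profile, candidates):
    """Order (video_id, category) pairs by the viewer's accumulated interest.

    With an empty profile every score is zero, and because sorted() is
    stable the candidates come back in their original order.
    """
    total = sum(profile.values())

    def score(item):
        _, category = item
        return profile[category] / total if total else 0.0

    return sorted(candidates, key=score, reverse=True)


candidates = [("v1", "dance"), ("v2", "cooking"), ("v3", "furries")]

new_user = Counter()
cold_start = rank_candidates(new_user, candidates)   # unchanged order

regular = Counter()
update_profile(regular, "cooking", 1.0)   # watched to the end
update_profile(regular, "furries", 0.4)   # watched partway
personalized = rank_candidates(regular, candidates)  # cooking, furries, dance
```

A competitor can copy the ranking code, but not the accumulated profile, which is why the text locates the advantage in time spent rather than in the app's features.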
Fostering a healthy ecosystem of creators required three things. Firstly, users need to convert into becoming creators. Secondly, creators need to be able to find their audience and build a following. Lastly, they need a way to monetize that following either directly or indirectly. TikTok had considered all three steps.

The art of converting users into creators was a matter of heavily amplifying the use of memes as a framework to inspire and motivate. Videos could be migrated between regions to be used as “educational materials,” showing users what was possible. Backing that up were best-in-class, intuitive video editing tools and a constant pipeline of new fun filters.

Due to the high new user growth through paid ads, building a following on TikTok was easy. Multiple stories appeared online of people rapidly gaining large followings with what could only be generously described as “very average content.” This dynamic was by no means unique to TikTok; the rapid rise of a new content platform will naturally spawn a new crop of influencers who grow along with the platform due to the temporary imbalance between supply and demand of content. Well-established older platforms such as Instagram suffered from the opposite dynamic, as they were saturated with accounts producing high-quality work competing for attention.

Above: The three-step process of converting users into long term quality creators.

Monetization was the most challenging step. How does one extract value from such massive amounts of attention? For most people, regularly creating high-quality video content was a full-time job. Without a sustainable way to make a living from their account, creators would become frustrated and move their following over to other platforms, or simply stop creating. Many influencers found indirect ways to work with brands to do endorsements and soft ad promotions.
To facilitate this further, TikTok set up a marketplace matchmaking brands with creators 302 and began to test monetization features that had proved effective in China, including embedding store links within videos so people could buy what they saw. TikTok also adapted a tried and tested China strategy, setting up a “Creators fund” to subsidize influencers who met specific criteria. In 2020, they announced a fund starting with $200 million, growing to $1 billion in the U.S. over three years. 303 To manage this fund and the many other initiatives, TikTok’s U.S. team needed to expand considerably. Onboarding so many employees so quickly was perhaps the biggest challenge of all.
In a letter 304 to staff marking the company’s eighth anniversary, Yiming publicly disclosed the company’s 2020 target to reach a hundred thousand employees. This goal would see the company headcount leap ahead of Facebook and Tencent, with the lion’s share of new hires being outside China.
The first significant American hire was Vanessa Pappas in February 2019. 306 Vanessa had spent seven years at YouTube, working her way up to become global head of creative insights and specialized in growth tactics for influencers and celebrities. She became general manager for TikTok in the U.S. Four months later, Facebook vice president of twelve years, Blake Chandlee, jumped ship over to TikTok. When asked why he joined, Chandlee replied, “the metrics on TikTok were astounding.”
By late 2019, ByteDance had already begun to set up shop on Facebook’s doorstep. Moving into an office space previously occupied by messaging app WhatsApp, ByteDance wasted no time poaching Facebook staff, offering salaries as much as 20% higher. 308 News broke in May 2020 that TikTok had rented 232,000 square feet of prime office space at Times Square, New York.
In May 2020, news broke of his latest catch – Walt Disney executive Kevin Mayer was joining ByteDance. Mayer’s hire sent shock waves through the American business community at a time when awareness around TikTok’s parent company was still low. It was a marquee signing of a top-level business leader from one of America’s most respected companies. Mayer’s title was CEO of TikTok and COO of ByteDance, with China reporting directly to Yiming.