Tim interviews Mu Chen, CEO and founder of BigOne Lab, a tech-enabled investment research firm in China that provides alternative data solutions to institutional investors globally. Since its founding in 2016, BigOne Lab has built data products from a variety of data sources to track operation metrics of some of the most followed companies in China. Mu shares how he leveraged his previous work experience to capture a growing industry in China, understanding the fragmented alternative data landscape in China, how to value alternative data as an institutional investor, customer maturity and future industry tailwinds and headwinds.
[Editor's note: this interview has been edited and condensed for clarity. The opinions expressed in this article are Mu's own and do not reflect the views of BigOne Lab; Tim Chen is Head of International Product at Mobike and Partner at TheHarbingerChina. Zoe Li is a summer analyst at Miss Fresh and Master's Student at Johns Hopkins SAIS]
Today I’m here with Chen Mu of BigOne Lab. So Mu, tell us a bit more about your background, what BigOne Lab is, and how you came to find this opportunity.
Mu: Sure, so I’m the founder and CEO of BigOne Lab. I started the firm in 2016 after I came back from New York. Before starting the firm I spent a few years working in banking and then at a data startup in New York. As for how I came to find this opportunity: after spending a few years in banking I saw that much of the work is pretty manual and pretty repetitive. So about three years after graduating from college I started looking at new opportunities and came across a startup called YipitData in New York. They are a data company that helps hedge funds track US publicly listed companies, using web scraping to collect information about companies, aggregate it, analyze it, and deliver it to hedge funds that are invested in those companies. During my years at YipitData I helped them build trackers on Alibaba and JD. That was in 2014.
Yes in 2014 Alibaba IPO’d as well, which ushered in a wave of future Chinese IPOs on US stock markets.
Mu: Exactly. At that time it struck me that we had a pretty good run with data; we tracked the GMV of JD, and because JD was new to the market, investors were very interested in the data. So we had a good run in terms of revenue and sales of the product. We call the package a product. That struck me on a few points: one, a lot of web scraping and information collection is done by machines, which changes the efficiency and the amount of information you can collect and process; two, you can empower investors to obtain new insights from this new channel.
So in 2015 and 2016 I came back to China four times; in total I spent about two months in China. One observation was that more and more new economy companies were popping up. Basically, new economy companies are those that have some sort of connection with the internet. As a result, a lot of their data is being digitized and exposed to the data world. Another observation was that the Chinese economy is becoming more and more connected with global capital markets. Given that global capital markets are mainly driven by institutional investors who use more methodical and scientific ways of investing and researching, there’s more demand for data on China. So that’s where I thought of starting a startup that discovers, collects and aggregates data about new economy digital companies in China and provides it to global capital market investors. That’s what we have been doing for the past three years. We have a team that explores what types of raw data are available out there, and then we connect with them, onboard that data, and use our own proprietary algorithm to process that data and provide it to investors in a way that is digestible to them.
When you were looking at this industry, what did you notice about the data you were collecting? What kind of feedback were you getting?
Mu: The reason I was getting more and more convinced about this...during my years in New York working in the data industry, I saw the number of alternative data sets and data types increasing, and the number of firms productizing this data was growing rapidly. Before I joined there were fewer than 50 firms, and after two years there were north of 300 firms. More and more data types were coming out – credit card data, geolocation data, email receipt data, satellite data, and sentiment data. The proportion of useful data is probably not very saturated; I think at that point I was seeing a blooming market where some data was useful and some was not, but in general we saw that investors, the general client base, were spending more and more on alternative data. So that was a trend I saw. If I map that trend onto China, I saw China was three to four years behind the US, so I thought it was a good time to come back. Even at this stage, there are probably fewer than 20 or so firms in China, but given the size of the Chinese economy compared to the US, the number of data firms and the types of alternative data that can be productized should rise on a comparable scale.
With respect to alternative data products, for our listeners who are not too familiar: you have Bloomberg, which covers equities; you have Crunchbase and CBInsights, which cover more private tech companies, and their counterpart in China is Itjuzi; and then you have companies like YipitData and Foursquare that mine public data, for example through web scraping or, in Foursquare’s case, by looking at your check-ins, and then sell this data to funds. BigOne Lab falls into this last piece. Can you tell us more about your data sources?
Mu: So this is a pretty interesting time because I’m formulating a new version of our strategy. Our learning in the past 18 months is that alternative data is still growing, and that alternative data on its own may be very important to an investment, but it is not as important as the core needle-mover for an investment decision. So we’re taking a step back and truly understanding what the user is paying us for: ultimately it is a channel that provides them with information that may be valuable for their investment decisions. So we’re formulating a high-level strategy where we want to be a tech-enabled investment research product firm. What do we mean by tech-enabled? Alternative data itself is very tech-enabled, from the very beginning where data is generated to the end where the data gets delivered. It is pretty much machine-driven with some human component. Some of the more traditional ways of research are things like surveys and expert networks; over 60-70% of those processes are human and manual, so there’s room for us. First of all, our clients still need those products because there are gaps of information that alternative data cannot fill. But those types of research products are manually driven; our vision is thinking about how we can apply technology to increase the efficiency of those products as well, so that we can have a more integrated solution for our clients. So that’s where we are. Given the maturity of the market and of the alternative data sector, a lot of data is still being explored and developed into full products, so we can’t spend all of our resources on younger types of data sources where the risk of not being relevant is high.
When you say not being relevant do you mean it is not timely; not structured; it’s hard to figure out where the important indicators are?
Mu: There is a limitation of resources. Take payment data in China, for example: there is a limitation of supply. A good-quality supply has high coverage of the target population, and its coverage is consistent. Those two criteria set a pretty high bar for data sources. The payment data market is younger and less mature because it is fragmented. Third-party payment is very fragmented in China, but mobile payment is very concentrated, e.g. Alipay and WeChat Pay can get more than 90% of market share. We don’t have access to their data, so we rely on third-party service providers that act as pass-through gateways for those transactions and, as a result, have visibility into mobile payment behavior. But that market is very fragmented, and for us to aggregate enough coverage is very difficult; we have to talk to enough data sources and go through onboarding and compliance processes. Thus it’s hard to get a good supply of transaction data.
Source: BigOne Lab
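The two supply criteria Mu describes, high coverage of the target population and consistency of that coverage, can be made concrete with a small calculation. The Python sketch below is only an illustration under stated assumptions: the function name `coverage_stats` and all figures are hypothetical and do not come from BigOne Lab. It measures consistency as the coefficient of variation of per-period coverage, which is one possible choice among several.

```python
from statistics import mean, pstdev


def coverage_stats(panel_users, target_population):
    """Return (average coverage, coefficient of variation) across periods.

    panel_users[i] is the number of users a data source observes in period i;
    target_population[i] is the size of the population it should represent.
    Both lists are hypothetical inputs used for illustration only.
    """
    ratios = [p / t for p, t in zip(panel_users, target_population)]
    avg = mean(ratios)
    # A low coefficient of variation means coverage is stable over time.
    cv = pstdev(ratios) / avg
    return avg, cv


# Hypothetical monthly figures: coverage drops sharply in the last month.
panel = [120_000, 125_000, 118_000, 60_000]
population = [1_000_000, 1_000_000, 1_000_000, 1_000_000]

avg, cv = coverage_stats(panel, population)
print(f"average coverage: {avg:.1%}, variation: {cv:.2f}")
```

A source could score well on average coverage and still fail the consistency test, as the sudden drop in the last period of this made-up series suggests.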
Customer, Supply, and Use Cases
For companies like YipitData who are processing this data and sending it to hedge funds, and Foursquare who pull raw data, my understanding is that a few years ago hedge funds wanted this data at the raw level so they can manipulate it themselves. How do you think about this with respect to your product? When you collect and aggregate from multiple sources, how does BigOne or the industry think about layering types of processes on data before selling to the client?
Mu: That is a good question. It depends on a few things on the client side. One is scale; the second is the client’s level of sophistication. Scale depends on AUM; what we’ve come across is that the larger the firm, the larger the budget it can spend on research, so it can afford to build in-house capability because it has the resources to make that happen. So that’s number one. Number two, sophistication: basically we have two types of clients, funds that have some sort of data team in house, and those that don’t have any such capability. Our belief is that in the future funds will need to have data processing capability, whether they are fundamental-driven funds or quant funds, or even private equity and venture capital, because at the end of the day fund investors make money by taking advantage of information asymmetry.
Source: BigOne Lab as of Apr 2019. Ant Markets, conceptualized by Xiaomi, are markets with lower concentration, smaller market size and more players. There are no clear leaders, enabling winner-take-all dynamics.
So the key point is information, and we believe that in the future funds will always have some level of data processing power. At the current stage we face two types of clients. For the first type, those that don’t have much data capability, we build a more finished product and deliver our data in a dashboard and data visualization format, so it is similar to a Bloomberg Terminal. For the second type, clients that have data capability, we provide different levels of processing on top of the data depending on what type of capability they have. Our learning from conversations with clients is that very few want super raw data. Some extent of processing is required, just to clean up the data and make it accessible. So depending on the level of processing, our value proposition is different.
BigOne has been operational for a couple of years now; the BigOne product tracks 200 companies over 10 sectors across data sources such as payments, geolocation, and some other types. How are you able to capture this breadth across Chinese users? What does the market look like on the supply side? And how do you think about the supply market landscape?
Mu: The supply side is what is interesting and attractive for startups in our position; that’s why I came back. On the supply side there are many, many types of alternative data. We have talked to about a dozen so far and productized three types of them. We just don’t have enough resources to exhaust all types of alternative data that could be useful for research. To date we have productized web scraping, mobile carrier, and geolocation data, and we are looking to grow our supply of payment data by finding a good payment data supplier. There are many really interesting data sets such as weather data; there’s a company that’s tracking vehicles across China, inventory management systems for different brands...so these guys accumulate a lot of data, and a significant number of those suppliers are interested in productizing or making use of the data, which gives us the opportunity to talk to them and see if we can onboard them as a supplier. The other situation with them is that they have very little knowledge about strategy, research, and investment, so there’s definitely a cultural, knowledge, and expertise gap for them to go directly to our users, so they go to someone like us.
Source: BigOne Lab
What do you think the market size is in China for alternative data?
Mu: We estimate that by around 2023, over the next five years, the market for alternative data will grow to $200-$400 million per year. And we think the bigger market, investment research and strategic research products, is itself a $10 billion market. So that’s the bigger market in China. Investors always ask what our ultimate goal is. The ultimate goal is to be a tech-enabled research product company for the global market, which we estimate to be around $200 billion. The primary market is the smaller part when we size our market; the bigger market is global secondary market investors as well as corporates that are doing strategic research.
Over the next five years what do you see as major tailwinds or headwinds in the alternative data market? Is there regulation risk?
Mu: Regulation risk will significantly increase the cost of acquiring data. The trend I see in regulation is to define data as an asset belonging to individuals and then build protection around those assets. Once it is classified as an asset, we’ll have to pay for it. So regulation will make data more costly to acquire. As a disclaimer, we don’t take in any individual data; when we take data from a supplier we have them aggregate it on their premises before sending it to us. So one headwind is regulation; another is market risk. If the economy has a downturn then investors may not have the budget to buy data research.
For tailwinds, there are still hundreds of Chinese startups, including more than 50 unicorns, lined up for IPOs. Once they go public, investors will need data to understand those companies. Another one is on the corporate side: as the economy slows down and the cost of capital increases, corporate executives need to be more thoughtful about where they put their capital and resources, so they need to make more informed decisions with more scientific methodology. You can’t just wing it – try this and try that – because your cost of trying is higher now in this economy than two years ago, when you could easily raise hundreds of millions of dollars.
Under these tailwinds and headwinds, for your product, where is the competitive advantage vs. the other players in the alternative data space?
Mu: Our focus is to serve top-end clients. Our differentiation is that we are likely one of the most professional firms with the most data expertise in China. So for clients who seek a full solution on alternative data, we are the go-to place. Our brand is pretty good in our community even though we’re just three years old; we get invited to speak as data experts at investment bank conferences – Morgan Stanley, JP Morgan. We get invited by expert networks as well. So our in-depth understanding of how to use alternative data in a valuable way for research is what makes us different from our competitors.
When you say high end, you define it as the wideness in scope of supply you can source from, how deep you drill into each one and also how you package it into something that is meaningful.
Mu: We have one of the most complete supplies of data sources and the expertise to process all these different types of data. So if you want to solve a problem with alternative data, we are best equipped to help: we have the most supply and the know-how to make use of it for investors. For clients who are looking to make use of alternative data, or just to get a view of what is going on and what types of alternative data exist, we are the go-to place.
Source: BigOne Lab
What is one thing you learned about the alternative data market in China that you think others are overlooking?
Mu: I think one big thing is that even though alternative data is a concept from the US, the types of alternative data are very different in China, and even where the data types are the same, their characteristics and how to use them are very different. The difference is in applicability and use cases. Take geolocation data: you can’t easily use it to track brand performance in China. It’s harder. In the US you have big parking lots, so you can track it. Just knowing the type of data is not enough for users; the key is to understand how that data is generated and to have the contextual knowledge around how it is generated.
Source: BigOne Lab as of Apr 2019. Example of Starbucks vs. Luckin footprint in China. However, one cannot compare Luckin with Starbucks on purely a geolocation basis without understanding user behavior and product usage.
So if I’m a fund looking at Chinese stocks, maybe in the US I would look at payment transaction data exhaust as an indicator of sales but in China that might not be the most relevant metric because I don’t understand how that data was generated and what those underlying users are like.
Mu: Exactly, you may use payment data but not realize the market share of UnionPay (Chinese payment network similar to Mastercard or Visa) is smaller, so you have to prorate. So that kind of contextual knowledge is missing in translation.
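The prorating Mu mentions is simple arithmetic: if a payment network only sees a fraction of all spending, the observed spend has to be scaled up by that fraction to approximate the total. The Python sketch below is a minimal illustration, assuming the network's share is known and is the same for the company being tracked as for the market overall. The function name `prorate` and every number are hypothetical, not real UnionPay or BigOne Lab figures.

```python
def prorate(observed_spend: float, channel_share: float) -> float:
    """Scale spend seen on one payment channel up to an estimated total.

    channel_share is the fraction of all spending that flows through the
    channel (0 < channel_share <= 1). It is an assumed input here; in
    practice it has to come from contextual knowledge of the market.
    """
    if not 0 < channel_share <= 1:
        raise ValueError("channel_share must be in the interval (0, 1]")
    return observed_spend / channel_share


# Hypothetical example: the panel sees 30 million in spend at a retailer,
# and the channel is assumed to carry 15% of that retailer's payments.
observed = 30_000_000
share = 0.15

estimate = prorate(observed, share)
print(f"estimated total spend: {estimate:,.0f}")
```

The estimate is only as good as the assumed share: if the share differs by customer segment or changes over time, a single scaling factor will mislead, which is the contextual knowledge Mu says gets lost in translation.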
What type of investor or strategic partner would be helpful for you?
Mu: I think the window of consolidation in the Chinese tech-driven research market is coming earlier than I expected. I was expecting the wave of consolidation to kick in late next year, but in the next six to nine months there may be a wave of consolidation, with players offering different products, making alliances and better serving clients. So we are hoping to have enough dry powder to catch the wave – either we’ll be leading the alliances or we’ll be led. So we’re looking for partners or investors that, one, believe alternative data is a key future component of research processes and, two, have strategic resources for us upstream, downstream, or cross-section, meaning someone that has an interesting set of data or channels for us, or that complements our products. Or just a financial investor who can provide us with access to different financial institutions for expansion.
Great, this has been very insightful. I wish you the best of luck!