Ed Huang first developed an interest in engineering at the age of 12. After finishing his undergraduate degree in software engineering, Ed worked for NetEase and Wandou Labs, two large Chinese internet companies, as an infrastructure engineer. While at Wandou Labs, Ed experienced firsthand the difficulty of scaling databases manually as the amount of data increases. Seeing a huge opportunity, Ed co-founded PingCAP in 2015 with a vision that there should be a new kind of database to tackle the problem of big data storage. Besides his day job as the CTO of PingCAP, Ed is also a passionate open-source advocate and tech blogger.
Fei: Thanks for joining us today Ed! Why don’t we start out by helping the audience better understand the broader industry PingCAP plays in?
Ed: PingCAP is a software company. Our product is a database called TiDB, similar to Oracle’s. But under the hood, TiDB is totally different from traditional databases like Oracle or MySQL. TiDB is designed for big data and its distributed architecture makes it possible to scale to any size. The highlight is that TiDB still has a standard SQL query interface like traditional databases, which makes the migration cost to TiDB virtually zero for applications that already depend on traditional databases.
In my opinion, infrastructure software is just like water and electricity – it’s industry agnostic. As businesses become more data-driven, solving database scalability problems is also increasingly important.
Databases are a type of standardized software, and SQL is the standard interface for developers. Because PingCAP doesn’t need to do customizations, it is much easier to scale our business. As of today, we’ve done Proof of Concept trials for over 400 companies in China, and more than 30 customers have already used our database for more than 3 months. This includes Internet companies as well as those in the gaming, fintech, and banking sectors. I’m proud of the traction we’ve gained thus far, but there’s also a lot more we want to accomplish.
Fei: Infrastructure software is a difficult business with only a few Chinese players at the moment. What inspired you to launch PingCAP?
Ed: First of all, our three co-founders are all engineers. We love building systems like this and fortunately, we’re quite good at it. We realized that there was a huge pain point in the infrastructure software business, and thus the challenge – as well as the opportunity – was becoming the first company to achieve de-facto standard status.
The biggest difference between us and other Chinese players is that we fully control the project. Most Chinese infrastructure solution vendors are system integrators, which means they use open-source software, make some customizations, then sell the product to their customers, but they don't have control over the project itself. This is totally different from our business model.
I think there’s a big opportunity in China’s enterprise solutions market because the cost of labor has risen rapidly in recent years and this will likely continue into the future. As a result, paying more for advanced software to solve problems, rather than just throwing more people at them, will become the mainstream approach.
Fei: What are some challenges PingCAP is facing in terms of tech, business development, and talent acquisition? Why are you looking to open an office in Silicon Valley?
Ed: I think the biggest challenge is talent acquisition. In our industry, a single good engineer is ten times more productive than an average one. That’s one of the reasons why we open-source our software and why we put effort into evangelism – we want talented engineers to know us and understand what we’re doing. In the infrastructure software space, the number of quality database engineers is much higher in the US than in China. We strategically opened an office in San Francisco because the location attracts top talent.
Fei: What is PingCAP’s business model and how do you monetize?
Ed: We offer our clients two product options. The first is a traditional enterprise license model, where we have a core open-source product and offer additional on-demand proprietary services built around it, such as a user dashboard, inspector, security toolkit, automatic deployment, and rolling updates. The other model is a subscription model for our services. When we sell a subscription, the buyer is buying not only support, but also access to these tools without limitations.
The monetization in the license model is based on the number of nodes and the data scale, whereas in the subscription model, cluster size is not limited. We encourage users to choose the subscription model, because the more data we have in our database, the greater the collective benefit to all our users.
The Infrastructure Software Market
Fei: What are some of the vertical industries PingCAP is in, and why are they early adopters of the technology?
Ed: We’re industry agnostic, but for now, we’re focusing on internet companies such as gaming, fintech, and online services. Most of them heavily depend on MySQL and suffer from MySQL's scalability problem, so they have strong motivation to shift to our solution. In the long run, we plan to also focus on the financial industry, specifically sectors such as banking, securities, and insurance.
Fei: What are the characteristics of these segments?
Ed: At different stages, we focus on different metrics. At the early stage, such as right now, the product itself is the most important factor. We believe that real-world workload will help us improve the quality of the product, so at the moment, we are focusing on internet companies for four main reasons:
1. They’re already using MySQL, so their cost of adapting our system is much lower;
2. They have shorter decision-making chains, so we can put our system online very quickly - time is important for us;
3. Their workload is very heavy, which helps us to improve our product;
4. Compared to traditional industries, internet companies are more willing to share their solutions with the tech community, which helps us to convince more companies to use our product.
Fei: What are some trends in the infrastructure software space?
Another trend in the infrastructure software industry is that customers are increasingly willing to pay more for better products. I think this is because the cost of labor is rising rapidly.
The Open-Source Developer Community
Ed: Well, there’s no magic. First of all, everyone uses databases. Second, we take marketing to an international developer community very seriously. Although all of TiDB's original developers are Chinese, we insist on writing all our documentation and code comments in English.
We announced our release on international platforms such as HackerNews, Reddit, and Google Groups, as well as our own personal social networks like Twitter and Weibo. We’ve also intentionally linked ourselves with the MySQL community, which works well because we're compatible with MySQL. We also hired technically-skilled English writers who not only translate documents, but also maintain our social media presence and gather feedback from our user community.
In the open-source world, communication is more important than the code itself. As I’ve stated many times before, open-source doesn’t just mean that the source code is open, but that the entire community is inclusive and collaborative. This attitude is critical. Maintaining our community costs about 30% of our total bandwidth - I think it’s worth it.
Fei: Regarding the open-source infrastructure software phenomenon, are there any companies that have inspired you?
Ed: I learned a lot from MongoDB. MongoDB did a great job cultivating relationships with developers and marketing its content. For example, MongoDB uses bloggers and advocates to make developers believe that MongoDB is very cool. Simultaneously, MongoDB has optimized its deployment experience and interface usability such that you don’t have to be a database expert to use MongoDB, which is a great move. And as you know, MongoDB is filing for an IPO, which is a great milestone. I feel very happy for them.
Scale-Up vs. Scale-Out Architecture & Cloud Adoption
Ed: Good question. Well, I’m betting on scale-out. If you want to build a scalable and highly available storage system, scale-out is the only way to go. On the other hand, the speed of data growth is faster than the speed of hardware innovation. But I think these two methodologies aren’t necessarily conflicting. In recent years, new hardware innovations have revolutionized the industry and there are a lot of new innovations in storage and network infrastructure. I think we should embrace both methodologies. Even though PingCAP chose scale-out architecture, we can also benefit from faster hardware.
Ed: Everyone thinks cloud is the biggest enemy of open-source, but I disagree. I think cloud is inevitable, so becoming a cloud-native database is part of PingCAP’s mission.
I’m not afraid of database solutions provided by public cloud vendors. I think that in the future, big companies won’t want to be locked in to one cloud provider. These companies will put their data on different cloud vendors, which means that the database should be independent and provide a unified interface across different cloud platforms. PingCAP has actually built partnerships with some of the biggest cloud vendors, including Tencent Cloud and UCloud, as their official cloud database solution.
As for cloud adopters? I think it's 50-50. Most cloud adopters are internet companies, which is understandable. I expect traditional companies to speed up their “cloudification” process as well – cloud solutions are the future.
Infrastructure Software in China vs. US
Fei: The US infrastructure software industry is far more advanced than its Chinese counterpart. What can China learn from the US?
Ed: I think the most important takeaway from the U.S. market is that open-source is the future. Most new-generation infrastructure software is open-sourced. It’s an effective way to let developers know your product as well as a modern development strategy.
Ed: I think it’s good news for us. “I” means IBM mainframe, “O” means Oracle database, and “E” means EMC high-end storage. “I” and “E” are not so hard to replace, and there are many mature drop-in replacements. But “O” is different because replacing databases is nearly impossible, especially considering compatibility issues and the cost of refactoring existing applications.
At the same time, you can’t deny that the Internet is transforming everything, including traditional industries. Some prominent examples include e-commerce, online payment, and SaaS. New businesses require new IT infrastructure, and they’re learning from Internet companies in that they no longer use Oracle systems. Therefore, our strategy is that we first try to become the infrastructure software provider for Internet companies, and after doing so, traditional enterprises will also adopt PingCAP. I don’t think a government mandate to promote certain technologies, like IOE, will achieve the best result; the product that will ultimately win is the product that meets the users’ needs with the highest quality.
Fei: How does the process of building an enterprise software startup differ between the U.S. and China? What are the differences in terms of willingness to pay?
Ed: The enterprise software market in the U.S. is very mature. The pricing model, sales process, channels, and specifications are all well-established. You follow the rules, and everything is like solving math problems. At the same time, the U.S. is a very competitive market, and engineers have many choices and are happy to try new options. Paying for software is not weird. It's very hard for a company to become monopolistic.
In China, few companies are succeeding in the enterprise software market. For a long time, people thought software should be free because compared to hardware, software is invisible – you can’t see it or touch it. People didn’t respect intellectual property. Furthermore, in the old days, the cost of labor in China was extremely low. The quality of the products made by external vendors and system integrators wasn’t consistent. Therefore, for a CIO or CTO, hiring engineers and building solutions in-house seemed like the best option. Selling software is also very relationship-oriented and channel-oriented, not product-oriented, which makes margins really low. This isn’t very profitable for businesses.
But things have changed. In recent years, the government has increased its protection of intellectual property, and people are starting to realize the value of software.
Fei: Compared with the U.S., what do you think of the open source software business in China?
Ed: What needs to be clarified, and many people misunderstand this, is that open source itself is not a business model. It’s more of an effective way to promote, adopt, collaborate on, and develop software. A product's commercialization potential depends on the value it delivers to its users. A simple example is that both Docker and MongoDB are open-source companies, but the ways by which they make money are totally different. We have a NewSQL database solution, so our business model will of course be different from a company with a NoSQL database solution. The solution determines the value of the product, and in my view, open source is a great way to promote and develop our technology.
Our users aren’t paying for the code – they’re buying a solution to fix their core relational database scalability problem. I believe the logic behind this business approach makes sense in both China and the U.S., except the U.S. approach may be more standardized.
So in terms of our business, how much profit we will make depends on the following:
1. How big the market is for distributed databases – and we think the sky is the limit.
2. Controlling our sales cost, both pre-sale and post-sale. That’s why being Cloud-native – which drives down that cost – is a very important strategic direction for us.
To elaborate on the first factor, the market size of database products is massive – just look at Oracle’s market cap. MySQL is the most popular open-source database in the world. So the remaining issues are growth, channel, and cost control. Our skilled technical team figured out a way to reduce the migration cost from MySQL to TiDB to almost zero, so our POC cost is very low. Most of our clients don’t even need us; they can install PingCAP’s products themselves, so our pipeline growth is very healthy. Open-source is viral in nature, which is how we’re able to cover China’s entire developer community with almost zero marketing expenditure.
In terms of keeping down the sales cost, we’ve already formed partnerships with several of China’s largest public cloud vendors as their cloud database solution. In fact, TiDB was recently launched on Tencent Cloud.
At the end of the day, our approach to business is simple: making money by solving problems, and doing so in a scalable way. That’s it. Nothing fancy.
The opinions expressed in this article are Ed Huang's own and do not reflect the views of PingCAP.