Synerise took third place, behind Baidu and DeepMind, in the KDD Cup, the most significant competition devoted to AI/ML.
About KDD Cup
The KDD Cup was held during the KDD conference organized by ACM and is unofficially known as the AI World Championship. Held since 1989, the KDD conference is the world’s oldest and most significant data mining event. Innovations such as crowdsourcing, large-scale data science competitions, algorithms for personalizing ads (e.g. Google), data mining (e.g. Facebook, LinkedIn), and recommendation systems (e.g. Netflix, Amazon, etc.) have come mainly from KDD.
In 2020, the conference attracted over 3,900 researchers from both the commercial and university worlds. KDD participants come from the largest technology companies globally, such as Google, Alibaba, Facebook, Netflix, LinkedIn, Tencent, Microsoft, IBM, Spotify, and Amazon. Equally crucial to the KDD community was the voice of state institutions such as NIH, NSF, or DARPA.
This year, nearly 2,500 teams worldwide competed in three KDD Cup competition categories, with three winners awarded in each category. Synerise competed in the most difficult of them, organized by Stanford University, Facebook AI, Google, and Intel.
“With our work, we want to prove that our AI team can compete with innovation leaders. We have created one of the most accurate and fastest systems. The processing time of the test set using the Synerise model is about 7 minutes. At the same time, the Google DeepMind solution takes as much as 12 hours,” said Michał Daniluk, AI Research Scientist at Synerise.
The competition task
The task was to predict the subject of scientific publications based on the edges of a heterogeneous graph of studies, citations, authors, and scientific institutions. The graph, of unprecedented size (about 250 GB), contained over 244M vertices of three types, connected by as many as 1.7B edges. This made it possible to verify whether the algorithms were ready to operate on very large-scale data.
“Large heterogeneous graphs appear in many practical applications. The graph we process as part of the KDD Cup concerns academic citations, but data with a similar structure is present everywhere: in e-commerce (customer transaction graphs), large knowledge bases, and document databases. Therefore, mastery in processing this type of data leads to a tangible business advantage, especially in improving the quality of recommendations and data retrieval. I am glad that data on these types of practical problems increasingly appears in competitions at leading conferences,” said Barbara Rychalska, AI Research Scientist at Synerise.
The Polish team
The Synerise team, consisting of Jacek Dąbrowski, Michał Daniluk, Barbara Rychalska and Konrad Gołuchowski, used the proprietary ML methods Cleora and EMDE, unlike most teams, which improved existing algorithms. The methods developed by the Synerise team previously allowed for victories in the SIGIR Rakuten Data Challenge 2020 and WSDM Booking.com Data Challenge 2021 competitions. They are also a vital element of the personalization system available to Synerise customers. The solution of the Polish team has already been published on the Stanford University website.
The most technologically advanced companies and universities globally took part in the competition. Synerise defeated teams from all over the world, including specialists from Intel, OPPO Research Topology Lab and Huazhong University.
“At Synerise, we focus on a fundamental understanding of the mathematical phenomena underlying deep learning. Combined with engineering finesse, it allows us to compete with the best research centers in the world, even though we only have a fraction of the resources available to them,” said Jacek Dąbrowski from Synerise.
The company offers a Big Data and AI platform based on proprietary database systems and AI algorithms. Its technology allows real-time processing of data from various sources and automated execution of business scenarios for retail, banking, telecommunications, or e-commerce. Synerise’s customers include CCC, Carrefour, Żabka, Orange, mBank, and Sharaf DG.