Key Dates
- February 28, 2023 – Proposal submission deadline
- March 5, 2023 – Decision notification
- March 15, 2023 – Start of the competition
- June 20, 2023 – Notification to the KDD Cup Winners
- August 7, 2023 – Formal announcement of the KDD Cup Winners
Contact email: kdd23-cup-chairs@acm.org
Description
This Call for Proposals invites industrial and academic institutions and non-profit organizations to submit proposals for organizing the 2023 KDD Cup competition. Since 1997, KDD Cup has been the premier annual data mining competition, held in conjunction with the ACM SIGKDD Conference on Knowledge Discovery and Data Mining. The competition is expected to last 3 months. The winners will be honored at the KDD conference opening ceremony and will present their solutions at the KDD Cup workshop during the conference. We are looking for strong proposals that meet the following requirements: a novel and well-motivated goal, an interesting challenge, broad outreach to the data science community, a rigorous and fair setup, a challenging yet manageable task, and accessibility to the KDD community:
- A novel and well-motivated problem: Of particular interest are tasks that address real-world problems, are novel to the data science community, and call for innovative solutions. We are particularly looking for problems that differ from the typical machine learning challenges of recent years. Please include a specific section justifying how your proposed problem will encourage new research and push the field forward.
- A rigorous and fair setup: The organizers should guarantee the availability of the data and the confidentiality of the test set (to prevent information leakage). The evaluation metrics should be both meaningful for the application at hand and statistically sound for objective comparison. A baseline should be established to show that non-trivial results can be achieved. An estimate of what constitutes a significant difference in performance will be much appreciated. We are interested in evaluations that speak to how ML performance aligns with real-world impact. We would be particularly interested in how the winning solutions could be further evaluated in more realistic settings, and how the success of these approaches in practice could be discussed.
- A challenging yet manageable task: The task should be challenging in the sense that there is ample room for improvement over basic solutions, and novel ideas are required to succeed in the competition. The task should be manageable in about 3 months’ time.
- Accessibility: The competition description should make the task accessible to the majority of machine learning and data mining practitioners, who might not have significant prior domain knowledge or access to large-scale computational infrastructure. The proposal should discuss how domain expertise can be factored in, or what simplifications have been made to reduce the need for it.
- Proposal details: Proposals should cover all the important details such as dates, submission and evaluation of results, and describe the competition rules clearly. As a rule of thumb, prepare a proposal as close as possible to the version you would publish on the competition’s webpage.
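As an illustration of the "significant difference in performance" estimate requested under the rigorous and fair setup above, one possible approach is a paired bootstrap over per-example test scores. The sketch below is only an example under stated assumptions, not a method required by this call: it assumes each submission yields one numeric score per test example (higher is better), and the function names `paired_bootstrap_ci` and `is_significant` are hypothetical.

```python
import random


def paired_bootstrap_ci(scores_a, scores_b, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for mean(scores_a) - mean(scores_b).

    Assumption: scores_a[i] and scores_b[i] are the scores of two submissions
    on the same test example i.
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("score lists must be non-empty and of equal length")
    rng = random.Random(seed)
    n = len(scores_a)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    resampled_means = []
    for _ in range(n_resamples):
        # Resample the paired differences with replacement.
        sample = rng.choices(diffs, k=n)
        resampled_means.append(sum(sample) / n)
    resampled_means.sort()
    lower = resampled_means[int((alpha / 2) * n_resamples)]
    upper = resampled_means[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper


def is_significant(scores_a, scores_b, **kwargs):
    """Treat the difference as significant if the interval excludes zero."""
    lower, upper = paired_bootstrap_ci(scores_a, scores_b, **kwargs)
    return lower > 0 or upper < 0
```

Under these assumptions, the width of the resulting interval gives a rough sense of how large a leaderboard gap must be before it reflects a real difference rather than test-set noise. Other metrics or significance tests may suit a given task better.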
Requirements
Please use the following template for your proposal submission:
- Problem description: Describe the problem and justify why it is important and novel. In particular, please elaborate on how your proposed problem differs from the competitions of recent years. Additionally, please include a discussion of the broader impact of this problem. Please prepare some data samples or scenarios for your proposed problem. If you plan to include more than one track, please describe the unique value of each track.
- Evaluation: Describe how you plan to evaluate the submissions. We encourage you to think about how the evaluation aligns with real-world impact. We are particularly interested in whether additional evaluation of the winning submissions can be conducted in the real world after the competition.
- Timeline: List the start of the competition (website setup, dataset release, leaderboard setup), the user registration deadline, the submission deadline, and the notification date. You may consider two rounds of submissions if suitable.
- Awards: We encourage you to think about awards beyond monetary prizes.
- Implementation details:
  - Competition infrastructure: Which competition infrastructure do you plan to use (e.g., Kaggle, or your own)? Is the competition platform you chose equally accessible to participants all over the world?
  - Teamwork: Explain how the host will organize a team dedicated to the KDD Cup. For each team member, please list their roles, responsibilities, and level of commitment.
  - Are there any privacy concerns for the released data? Have you obtained the rights to release the data for the competition from your legal counsel? What type of report, presentation, or code will you require for the final winning solutions?
  - How will you handle Q&A and possible revisions during the competition?
  - To what extent have you explored this problem, and what is the baseline solution?
  - How do you plan to promote the competition?
  - What is your plan to encourage participation from a diverse group of competitors?
- Host information:
  - Names, affiliations, email addresses, phone numbers, and short biographies of the organizers.
Selection priority will be given to innovative datasets and problems, and to proposals with a specific plan to promote participant diversity.
Please keep the proposal concise and strictly confidential. Please send your proposal in PDF format to kdd23-cup-chairs@acm.org by the submission deadline, and follow the updates provided on the KDD website.
Thank you for your participation.
– KDD Cup 2023 Chairs