Paper submission and deadlines

Website:

KDD accepts only electronic submissions in PDF format at https://cmt.research.microsoft.com/KDD2014/.

Deadlines:

Abstracts must be electronically submitted by Thursday, February 13, 2014, 11:59 pm Pacific Standard Time.

Papers must be electronically submitted by Friday, February 21, 2014, 11:59 pm Pacific Standard Time.

Format:

All submissions must be in PDF format, follow the ACM Proceedings Template (Tighter Alternate style), and should not exceed 10 pages (all included). The maximum file size for submissions is 10 MB.

Warning! There will be absolutely no extensions of the submission deadline.

Heads up! You can upload an early version of your paper well before the deadline. If you start registering your paper only a few minutes before the deadline, you may not have enough time to fill in all the forms. Replacing an earlier version later is allowed and does not take long, so we recommend uploading early and often.

The KDD evaluation criteria form the basis for acceptance decisions. If your paper is accepted, you will have the opportunity to revise it in response to the reviewers' comments before final submission for publication in the proceedings.

Acceptance notification will be sent on May 12, 2014.

Description of the Research track

We invite submission of papers describing innovative research on all aspects of knowledge discovery and data mining. Papers emphasizing theoretical foundations are particularly encouraged, as are novel modeling and algorithmic approaches to specific data mining problems in scientific, business, medical, and engineering applications. Visionary papers on new and emerging topics are also welcome. Authors are explicitly discouraged from submitting papers that contain only incremental results and that do not provide significant advances over existing approaches. Application oriented papers that make innovative technical contributions to research are also welcome.

Papers submitted to the Research track are solicited in all areas of data mining, knowledge discovery, and large-scale data analytics, including, but not limited to:

  • Algorithms: Graph and link mining, rule and pattern mining, web mining, dimensionality reduction and manifold learning, combinatorial optimization, relational and structured learning, matrix and tensor methods, classification and regression methods, semi-supervised learning, and unsupervised learning and clustering.
  • Applications: Innovative applications that use data mining, including systems for social network analysis, recommender systems, mining sequences, time series analysis, online advertising, bioinformatics, systems biology, text/web analysis, mining temporal and spatial data, and multimedia processing.
  • Big Data: Efficient and distributed data mining platforms and algorithms, systems for large-scale data analytics of textual and graph data, large-scale machine learning systems, distributed computing (cloud, map-reduce, MPI), large-scale optimization, and novel statistical techniques for big data.
  • Data mining for social good: Novel algorithms and applications of data mining to societal problems are especially encouraged. (For deployment of existing algorithms, consider the Industry/Govt. track.) Topics include: public policy, sustainability, climate change, medicine and health, education, transportation, biodiversity, and energy.
  • Foundations of data mining: Data mining methodology, data mining model selection, visualization, asymptotic analysis, information theory, and security and privacy.

Heads up! KDD is a dual track conference hosting both a Research track and an Industry & Government track. Due to the large number of submissions, papers submitted to the Research track will not be considered for publication in the Industry & Government track and vice versa. Authors are encouraged to carefully read the track descriptions and choose an appropriate track for their submissions.

Evaluation and decision criteria

As per KDD tradition, reviews are not double-blind, and author names and affiliations should be listed.

Submitted papers will be assessed based on their novelty, technical quality, potential impact, and clarity. For papers that rely heavily on empirical evaluations, the experimental methods and results should be clear, well executed, and repeatable. Authors are strongly encouraged to make data and code publicly available whenever possible.

Papers will be reviewed by members of the KDD program committee and decisions will be emailed to all authors by May 12, 2014. Note that there will not be an author response phase between submission and decisions.

Formatting requirements

Papers are limited to 10 pages, including references, diagrams, and appendices, if any. The format is the standard double column ACM Proceedings Template, Tighter Alternate style.

Additional information about formatting and style files is available online at: http://www.acm.org/sigs/publications/proceedings-templates.

Warning! Papers that do not meet the formatting requirements will be rejected without review.

Subject areas

When you submit your paper to CMT you will be asked to select which terms from a pre-defined list of subjects could best be used to describe the content of your paper. The purpose of this is to help us in assigning reviewers to your paper. Reviewers will also indicate their expertise using the same set of subject area terms.

Heads up! The subject areas we are using this year differ from previous years, so please carefully look over the options. In addition, you are allowed to add free-text keywords describing your paper as you see fit.

Authors submitting a paper will be asked to select one primary subject area, and up to 5 secondary subject areas from the sets of terms below. The terms have been grouped to provide a somewhat systematic overview of topics relevant to the KDD conference.

A paper about information extraction from web pages using latent variables could select the combination:

Primary:
Text.
Secondary:
Topic, graphical and latent variable models; Web mining.

On the other hand, a paper about outlier detection in protein sequences could select the combination:

Primary:
Anomaly/novelty detection.
Secondary:
Bioinformatics; Sequence.

For reference, the list of subject areas that will appear to authors and reviewers in the CMT conference management system:

  1. Adaptive learning
    1. Active learning
    2. Adaptive experimentation
    3. Adaptive models
  2. Applications
    1. Mobile
    2. E-commerce
    3. Healthcare and medicine
    4. Science
    5. Finance
    6. Public policy
    7. Education
  3. Big data
    1. Distributed computing – cloud, map-reduce, MPI, others
    2. Scalable methods
    3. Large scale optimization
    4. Novel statistical techniques for big data
  4. Bioinformatics
  5. Causal discovery
  6. Data mining for social good
  7. Data streams
  8. Design of experiments and sample survey
  9. Dimensionality reduction
  10. Economy, markets
    1. Viral marketing
    2. Online advertising
  11. Feature selection
  12. Foundations
  13. Graph mining
  14. Information extraction
  15. Mining rich data types
    1. Temporal / time series
    2. Spatial
    3. Text
    4. Sequence
    5. Unstructured
  16. Nearest neighbors
  17. Other
  18. Probabilistic methods
  19. Recommender systems
    1. Collaborative filtering
    2. Content based methods
    3. Evaluation and metrics
    4. Cold-start
  20. Rule and pattern mining
  21. Sampling
  22. Security and privacy
    1. Anonymization
    2. Spam detection
    3. Intrusion detection
  23. Semi-supervised learning
    1. Learning with partial labels
    2. Anomaly/novelty detection
  24. Sentiment and opinion mining
  25. Social
    1. Social and information networks
    2. Community detection
    3. Link prediction
    4. Social media
  26. Supervised learning
    1. Classification
    2. Regression
    3. Learning to rank
    4. Multi-label
    5. Neural networks
    6. Boosting
    7. Decision trees
    8. Support vector machines
  27. Transfer learning
  28. Unsupervised learning
    1. Clustering
    2. Topic, graphical and latent variable models
    3. Matrix/tensor factorization
    4. Visualization
    5. Exploratory analysis
  29. User modeling
  30. Web mining

Dual submission policy

Submitted papers must describe work that is substantively different from work that has already been published or is currently under review for another conference. In particular, papers submitted to KDD should be substantively different from any papers submitted to another conference where the review and decision period of the other conference overlaps with that of KDD.

Accepted papers will be published in the conference proceedings by ACM and also appear in the ACM Digital Library.

The rights retained by authors who transfer copyright to ACM can be found here.

Description of the Industry & Government track

We invite submissions describing implementations of data mining/analytics/big data/data science systems in industry, government, or non-profit settings. Our primary emphasis is on papers that advance the understanding of, and show how to deal with, practical issues related to deploying analytics technologies. This track also highlights new research challenges motivated by analytics and data mining applications in the real world. These applications can be in any field including, but not limited to, e-commerce, medicine, healthcare, defense, public policy, engineering, law, manufacturing, telecommunications, and government. This year, KDD has a special theme: data science efforts for social good. We highly encourage submissions that focus on that theme and describe data science work being done in areas such as education, sustainability, healthcare, community development, and public safety.

Submitted papers will go through a competitive peer review process. The Industry & Government track is distinct from the Research track in that submissions solve real-world problems and focus on systems that are deployed or are in the process of being deployed. Submissions must clearly identify which of the following three areas they fall into: “deployed”, “discovery”, or “emerging”.

The criteria for submissions in each category are as follows:

  • Deployed: Must describe deployment of a system that solves a non-trivial real-world problem. The focus should be on describing the problem, its significance, decisions and tradeoffs made when making design choices for the solution, deployment challenges, and lessons learned.
  • Discovery: Must include results that are discoveries with demonstrable value to an industry or government organization. This discovered knowledge must be “externally validated” as interesting and useful; it cannot simply be a model that has better performance on some traditional evaluation metrics such as accuracy or area under the curve. A new scientific discovery enabled by the use of data mining techniques is an example of what this category will include.
  • Emerging: Submissions do not have to be deployed but must have clear applications to Industry/Government to distinguish them from KDD research papers. They may also provide insight into issues and factors that affect the successful use and deployment of data mining and analytics. Papers that describe enabling infrastructure for large-scale deployment of data mining and analytics techniques also fall in this category.

Heads up! KDD is a dual track conference hosting both a Research track and an Industry & Government track. Due to the large number of submissions, papers submitted to the Research track will not be considered for publication in the Industry & Government track and vice versa. Authors are encouraged to carefully read the track descriptions and choose an appropriate track for their submissions.

Evaluation and decision criteria

As per KDD tradition, reviews are not double-blind, and author names and affiliations should be listed.

Submitted papers will be assessed based on their novelty, technical quality, potential impact, and clarity. For papers that rely heavily on empirical evaluations, the experimental methods and results should be clear, well executed, and repeatable. Authors are strongly encouraged to make data and code publicly available whenever possible.

Papers will be reviewed by members of the KDD program committee and decisions will be emailed to all authors by May 12, 2014. Note that there will not be an author response phase between submission and decisions.

Formatting requirements

Papers are limited to 10 pages, including references, diagrams, and appendices, if any. The format is the standard double column ACM Proceedings Template, Tighter Alternate style.

Additional information about formatting and style files is available online at: http://www.acm.org/sigs/publications/proceedings-templates.

Warning! Papers that do not meet the formatting requirements will be rejected without review.

Dual submission policy

Submitted papers must describe work that is substantively different from work that has already been published or is currently under review for another conference/journal. In particular, papers submitted to KDD should be substantively different from any papers submitted to another conference/journal where the review and decision period of the other conference/journal overlaps with that of KDD.

Accepted papers will be published in the conference proceedings by ACM and also appear in the ACM Digital Library.

The rights retained by authors who transfer copyright to ACM can be found here.

The ACM KDD 2014 organizing committee would like to invite proposals for workshops to be held in conjunction with the conference.

Description

The goal of the workshops is to provide an informal forum to discuss important research questions and practical challenges in data mining and related areas.

Novel ideas, controversial issues, open problems and comparisons of competing approaches are strongly encouraged as workshop topics. Representation of alternative viewpoints and panel-style discussions are also particularly encouraged for all the workshops.

Possible workshop topics include all areas of data mining and knowledge discovery, machine learning, statistics, and data and information sciences, but are not limited to these. Interdisciplinary workshops with applications of data mining and data sciences to various disciplines (such as medicine, biology, sustainability, ecology, social sciences, humanities, or aerospace) are of high interest.

Format

Each of the workshops will run for a full day (6-8 hours) or for half a day (3.5-4 hours).

We would like to encourage organizers to avoid a mini-conference format by:

  1. Encouraging the submission of position papers and extended abstracts;
  2. Allowing plenty of time for discussions and debates;
  3. Organizing workshop panels.

Submission and deadlines

Email address:

The workshop proposals should be sent by email (text or any other standard format) to workshops@kdd2014.org

Email format:

The email should contain the following information:

  1. The NAMES and AFFILIATION of all the organizers;
  2. The MAIN CONTACT person (e-mail and telephone number);
  3. Proposed TITLE of the workshop;
  4. A maximum of three paragraphs that describe the TOPIC of the workshop, the target AUDIENCE, and RELEVANCE for SIGKDD;
  5. One paragraph MOTIVATING the workshop (why we should organize it NOW in conjunction with KDD 2014);
  6. Tentative names of invited speakers, reviewers, and panelists (if a panel will be organized);
  7. The desired LENGTH of the workshop (full day or half day).

Deadline:

Workshop proposals must be submitted via email by Friday, March 7, 2014, 11:59 pm Pacific Standard Time.

Notifications of decisions will be sent on March 31, 2014.

KDD-2014 will feature tutorials on topics of interest to the research community as well as industry practitioners. We invite proposals for tutorials from active researchers and experienced industry practitioners.

Description

We seek tutorials covering the state-of-the-art research, cutting-edge industry development and applications, and practical tools in a data mining direction that stimulate and facilitate future work. Tutorials on interdisciplinary directions, novel and fast growing directions, and significant applications are highly encouraged.

We solicit proposals of tutorials of two types: (1) lecture style short courses about data mining technical research and development, and (2) hands-on style for practical skills and tools. Each tutorial should be about 3 hours in length.

Submission and deadlines

A tutorial proposal should consist of the following sections:

  1. Title
  2. Abstract (up to 150 words)
  3. Target audience and prerequisites. Proposals must clearly identify the intended audience for the tutorial (e.g., novice users of statistical techniques, or expert researchers in text mining), and the background that is required of the audience. The proposal should describe why the topic is important/interesting to the KDD community and outline the benefit to participants.
  4. Outline of the tutorial. Enough material should be included to provide a sense of both the scope of material to be covered and the depth to which it will be covered. Please provide as much detail as possible; this will help the KDD Tutorial co-chairs select the tutorials best suited for KDD and its audience. Note that the tutors should NOT focus exclusively on their own research results. A KDD tutorial is not meant to be a forum for promoting one’s research or product.
  5. A list of forums, with their times and locations, where the tutorial or a similar/highly related tutorial has been presented by the same author(s) before, highlighting the similarities and differences between those and the one proposed for KDD’14 (up to 100 words for each entry)
  6. Tutors’ short bio and their expertise related to the tutorial (up to 200 words per tutor)
  7. A list of the most important references that will be covered in the tutorial
  8. Equipment and software requirements, if any.
  9. (Optional) URLs of the slides/notes of the previous tutorials given by the authors, and any specific audio/video/computer requirements for the tutorial.

Proposals should be received by Saturday, March 15th. Please submit by email to tutorials@kdd2014.org with subject heading: “KDD14 Tutorial Proposal Submission”. Acceptance decisions will be made and communicated by April 15th.

We invite proposals for the KDD Cup 2014 data mining competition under the broader theme of “Data Mining for the Social Good”. KDD Cup is the well-known data mining competition of the annual ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. The KDD-2014 conference will be held in NYC on August 24-27, 2014. The format of the Cup involves publishing a dataset via Kaggle and having participants build the best possible predictive model for the task and submit their predictions for evaluation. The competition will last between 6 and 8 weeks, and the winners should be notified by the end of June. The winners will be announced at the KDD-2014 conference, and we are also planning to run a workshop that allows participants to showcase their approaches.

Proposals should include a short paragraph on each of the following items:

  1. Description of the problem addressed, with general background information on the application domain.
  2. Description of the available data, guarantee of availability, guarantee of confidentiality of the "ground truth", and size.
  3. Description of the competition tasks, their social merit and significance. The notion of "social" is rather broad.
  4. Description of the evaluation procedures and established baselines. The evaluation metrics should be both meaningful for the application and statistically sound for objective performance comparison.
  5. Names, affiliations, postal addresses, phone numbers, and short biographies of the organizers.

A good competition task is one that is practically useful, scientifically or technically challenging, can be done without extensive application domain knowledge, and can be evaluated objectively. Of particular interest are non-traditional tasks/data that may need novel techniques and solutions and/or thoughtful feature construction. Also of particular interest this year are proposals involving data and a problem from a field or discipline in which a successfully executed competition will result in a contribution of some lasting value to that field. You can assume that Kaggle will provide the technical support for running the contest. The data needs to be available no later than mid-March. While parts of the data can be public, the variable you are looking to predict may not be. Datasets should have at least 10K instances (millions are OK) and a sufficiently large/interesting feature set.

Please send your proposals to claudia.perlich@gmail.com by 2/22/2014. If you have initial questions about the suitability of your data/problem, feel free to reach out to the above address.