Title of Project
Group Member 1, Group Member 2, Group Member 3
Abstract
The abstract should be one paragraph that summarizes what you will do for your project.
Introduction
Provide a brief overview of data mining. Describe what your proposal is about and how the rest of the proposal is organized. State the nature of your project: for example, whether you will perform data mining tasks on a data set, implement a new algorithm in Weka, or modify some other system to incorporate data mining features. This section should be a page or less in length.
Data Mining Task
Provide the specific tasks you will perform on the data set. Include specific questions you will investigate, and the goals for the tasks. This should be independent of the specific techniques you will use to achieve your goals. This section should be a page or less.
Data Set
Describe the data set(s) you will be using in your project. Include the origin of the data set, an overview of the data set organization, attributes of the data, and challenges of the data set you've selected. Include any information you have about missing values in the data set. This should be one to two pages in length.
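One way to gather the missing-value information for this section is to count missing cells per attribute. The following is a minimal sketch, assuming the data is available as CSV text and that a missing value appears as an empty cell or as "?" (the marker used in Weka's ARFF format). The function name, the marker set, and the toy data are hypothetical and only for illustration.

```python
import csv
import io
from collections import Counter

# Assumption: an empty cell or "?" (the ARFF convention) marks a missing value.
MISSING_MARKERS = {"", "?"}

def missing_value_counts(csv_text):
    """Return (row_count, Counter of missing cells per attribute)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter({name: 0 for name in reader.fieldnames})
    rows = 0
    for row in reader:
        rows += 1
        for name in reader.fieldnames:
            value = row[name]
            # DictReader yields None for cells absent from a short row.
            if value is None or value.strip() in MISSING_MARKERS:
                counts[name] += 1
    return rows, counts

# Hypothetical toy data, not taken from any real data set.
SAMPLE = """age,income,label
34,52000,yes
?,48000,no
29,,yes
41,61000,?
"""

rows, counts = missing_value_counts(SAMPLE)
for name, n in counts.items():
    print(f"{name}: {n}/{rows} missing")
```

A per-attribute summary like this can go directly into the data set description, alongside how you plan to handle the affected instances.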
Methods and Models
Describe in detail the data mining methods and models you plan to employ to achieve the goals you set in the Data Mining Task section of your document. Include some mention of necessary data transformation. If you're implementing a technique, you should have some idea of how it will be implemented and incorporated into Weka or some other system. If you are combining techniques, explain how you intend to use the output of one technique as input into another technique. This section should be up to five pages in length. Be detailed: for example, explain how you will select the best model from the model space.
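To make "data transformation" and "selecting the best model from the model space" concrete, here is a minimal sketch under stated assumptions. It assumes a single numeric attribute, min-max scaling learned from the training data only, and a tiny model space of threshold classifiers scored by accuracy on a held-out validation split. Every name, threshold, and data value is hypothetical; your proposal would substitute its own transformations, models, and selection criterion.

```python
def fit_min_max(train_values):
    """Learn the scaling range from training data only."""
    return min(train_values), max(train_values)

def apply_min_max(values, lo, hi):
    """Scale values relative to the learned range.

    Values outside the training range fall outside [0, 1].
    """
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def threshold_model(threshold):
    """A one-rule classifier: 'high' at or above the threshold, else 'low'."""
    return lambda x: "high" if x >= threshold else "low"

def accuracy(model, xs, ys):
    """Fraction of instances the model labels correctly."""
    return sum(model(x) == y for x, y in zip(xs, ys)) / len(xs)

def select_best(candidates, xs, ys):
    """Return (name, model, score) with the highest validation accuracy."""
    scored = [(name, model, accuracy(model, xs, ys)) for name, model in candidates]
    return max(scored, key=lambda item: item[2])

# Hypothetical toy data: the training values are used only to learn the scaling.
train_raw = [10, 20, 30, 40, 50, 60]
valid_raw = [15, 25, 45, 55]
valid_labels = ["low", "low", "high", "high"]

lo, hi = fit_min_max(train_raw)
valid_scaled = apply_min_max(valid_raw, lo, hi)

# The "model space" here is three candidate thresholds.
candidates = [(f"threshold={t}", threshold_model(t)) for t in (0.25, 0.5, 0.75)]
best_name, best_model, best_score = select_best(candidates, valid_scaled, valid_labels)
print(best_name, best_score)
```

Learning the scaling range on the training data and reusing it on the validation data keeps information from the validation set out of the transformation, which is the same discipline you would describe when chaining the output of one technique into another.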
Assessment
Discuss the assessment methodology you will use to validate that you have found meaningful patterns. Will you use n-fold cross-validation, confidence intervals for accuracy, or other techniques? How will you create your training and test sets? What baseline models will you use? This section should be about a page or two in length.
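The assessment ideas named above can be sketched together as follows. This is an illustrative outline, not a prescribed procedure: it assumes a shuffled k-fold split, a majority-class baseline, and a normal-approximation (Wald) confidence interval for accuracy. The function names, the fixed seed, and the toy labels are hypothetical.

```python
import math
import random
from collections import Counter

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def fit_majority_baseline(train_instances, train_labels):
    """Baseline model: always predict the most common training label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return lambda _instance: most_common

def cross_validate(instances, labels, k, fit):
    """Return (correct, total) pooled over the k held-out folds."""
    correct = 0
    for held_out in k_fold_indices(len(instances), k):
        test = set(held_out)
        train_idx = [i for i in range(len(instances)) if i not in test]
        model = fit([instances[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        correct += sum(model(instances[i]) == labels[i] for i in held_out)
    return correct, len(instances)

def accuracy_interval(correct, total, z=1.96):
    """Wald interval for accuracy; z=1.96 corresponds to roughly 95%."""
    acc = correct / total
    half_width = z * math.sqrt(acc * (1 - acc) / total)
    return max(0.0, acc - half_width), min(1.0, acc + half_width)

# Hypothetical toy data: 14 "yes" and 6 "no" instances.
instances = list(range(20))
labels = ["yes"] * 14 + ["no"] * 6

correct, total = cross_validate(instances, labels, k=5, fit=fit_majority_baseline)
low, high = accuracy_interval(correct, total)
print(f"baseline accuracy {correct / total:.2f}, interval ({low:.2f}, {high:.2f})")
```

Any model you propose can be passed as `fit` in place of the baseline, so its cross-validated accuracy and interval are directly comparable with the baseline's. The Wald interval is rough for small samples or accuracies near 0 or 1, so state which interval you use and why.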
Presentation and Visualization
Describe how your results will be presented and visualized in a way that shows meaningful patterns in the data. This should be up to a page in length.
Roles
In this section, discuss the roles that each group member will have in the project. One paragraph per group member is sufficient.
Schedule
The schedule is a table of dates and tasks that you plan to complete by those dates. Tasks to be done by the progress report must be listed, as well as any other dates you want to set for yourselves. Additional deadlines are highly recommended. Be sure to include when you will have data transformation, modeling, assessment, visualization, etc. completed.
| Date | Tasks to be Completed |
|---|---|
| 10/30/01 | Tasks to be completed by your chosen date |
| 11/13/01 | Tasks to be completed by the progress report date |
| 12/04/01 | Tasks to be completed by the class presentation |
Bibliography
This is where you list bibliographic information for every reference you cite in the proposal. You should have lots of references.