Assignment: data preprocessing – UFO sighting data exploration

I have attached ufo_sightings_large.csv.

  • In this assignment, you will investigate UFO data over the last century to gain some insight.
  • Please use all the techniques we have learned in class to preprocess/clean the dataset ufo_sightings_large.csv.
  • After the dataset is preprocessed, please split it into training sets and test sets.
  • Fit KNN to the training sets.
  • Print the score of KNN on the test sets.

1. Import the dataset “ufo_sightings_large.csv” into pandas (5 points)
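
A minimal sketch of the import step. To keep it runnable without the attached file, it reads a tiny inline stand-in; the column names other than date, type, length_of_time and desc are not given in the assignment, and even the sample rows here are invented for illustration.

```python
import io
import pandas as pd

# Tiny inline stand-in for ufo_sightings_large.csv (sample rows are made up).
sample_csv = io.StringIO(
    "date,type,length_of_time,desc\n"
    "2011-11-03 19:21:00,light,about 5 minutes,bright light over the hills\n"
    "2004-10-03 19:05:00,circle,2 minutes,round object hovering\n"
)

# With the real file this would be: ufo = pd.read_csv("ufo_sightings_large.csv")
ufo = pd.read_csv(sample_csv)

print(ufo.shape)   # number of rows and columns
print(ufo.head())  # first few rows
```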

2. Checking column types & Converting Column types (10 points)

Take a look at the UFO dataset’s column types using the dtypes attribute, then convert each column to its proper type. For example, the date column can be converted to the datetime type, which will make our feature engineering efforts easier later on.
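
One possible way to inspect and convert types, shown on a toy frame. The seconds column is a hypothetical example of a numeric column stored as text; check the real dtypes output before deciding what to convert.

```python
import pandas as pd

# Toy data; "seconds" is a hypothetical column name used for illustration.
ufo = pd.DataFrame({
    "date": ["2011-11-03 19:21:00", "2004-10-03 19:05:00"],
    "seconds": ["300", "120"],
})

print(ufo.dtypes)  # both columns start out as text

# Convert the date strings to datetime and the numeric strings to float.
ufo["date"] = pd.to_datetime(ufo["date"])
ufo["seconds"] = ufo["seconds"].astype(float)

print(ufo.dtypes)
```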

3. Dropping missing data (10 points)

Let’s remove some of the rows where certain columns have missing values.
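
A sketch of dropping rows with missing values in selected columns. Which columns to check is an assumption here; count the missing values in the real dataset first and choose accordingly.

```python
import pandas as pd

# Toy data with some missing entries.
ufo = pd.DataFrame({
    "length_of_time": ["5 minutes", None, "2 minutes"],
    "type": ["light", "circle", None],
    "desc": ["bright light", "round object", "fast object"],
})

# Count missing values per column to decide what to drop.
print(ufo.isna().sum())

# Keep only rows where both chosen columns are present.
ufo_no_missing = ufo.dropna(subset=["length_of_time", "type"])
print(ufo_no_missing.shape)
```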

4. Extracting numbers from strings (10 points)

The length_of_time column in the UFO dataset is a text field that has the number of minutes within the string. Here, you’ll extract that number from that text field using regular expressions.
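
A minimal sketch of the extraction, assuming the minutes appear as digits followed by a word starting with "min". The helper name return_minutes and the new minutes column are illustrative choices, and the pattern will need adjusting if the real strings look different.

```python
import re
import pandas as pd

def return_minutes(time_string):
    """Return the integer that precedes 'min' in the string, or None if absent."""
    if not isinstance(time_string, str):
        return None
    match = re.search(r"(\d+)\s*min", time_string)
    return int(match.group(1)) if match else None

# Toy data standing in for the real column.
ufo = pd.DataFrame(
    {"length_of_time": ["about 5 minutes", "2 min", "a few seconds"]}
)

# Apply the helper row by row; strings without a match become missing values.
ufo["minutes"] = ufo["length_of_time"].apply(return_minutes)
print(ufo[["length_of_time", "minutes"]])
```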

5. Identifying features for standardization (10 points)

In this section, you’ll investigate the variance of columns in the UFO dataset to determine which features should be standardized. You can log-normalize any high-variance column.
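
A sketch of the variance check and log normalization on toy numbers. The seconds and minutes column names are assumptions, and np.log only works on strictly positive values, so zeros or negatives in the real data would need handling first.

```python
import numpy as np
import pandas as pd

# Toy numeric columns; names are hypothetical.
ufo = pd.DataFrame({
    "seconds": [60.0, 300.0, 7200.0, 120.0],
    "minutes": [1.0, 5.0, 120.0, 2.0],
})

# Compare variances to spot the column that dominates.
print(ufo.var())

# Log-normalize the high-variance column (values must be > 0).
ufo["seconds_log"] = np.log(ufo["seconds"])
print(ufo["seconds_log"].var())
```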

6. Encoding categorical variables (20 points)

There are a couple of columns in the UFO dataset that need to be encoded before they can be modeled with scikit-learn. You’ll do that transformation here, using both binary and one-hot encoding methods.
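
The two encodings might look like the sketch below. The assignment does not say which columns to encode, so country and state are hypothetical stand-ins; since type is the target in step 9, it is left unencoded here.

```python
import pandas as pd

# Toy categorical columns; names and values are assumptions.
ufo = pd.DataFrame({
    "country": ["us", "ca", "us", "gb"],
    "state": ["wa", "on", "tx", "wa"],
})

# Binary encoding: 1 if the sighting is in the US, 0 otherwise.
ufo["country_enc"] = ufo["country"].apply(lambda val: 1 if val == "us" else 0)

# One-hot encoding: one indicator column per distinct state.
state_set = pd.get_dummies(ufo["state"], prefix="state")
ufo = pd.concat([ufo, state_set], axis=1)

print(ufo.columns.tolist())
```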

7. Text vectorization (10 points)

Let’s transform the desc column in the UFO dataset into TF-IDF vectors, since there’s likely something we can learn from this field.
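
A minimal sketch of the vectorization using scikit-learn's TfidfVectorizer with default settings. The three descriptions are invented; with the real data the input would be the desc column.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy descriptions standing in for ufo["desc"].
desc = [
    "bright light over the hills",
    "round object hovering",
    "bright round light",
]

# Learn the vocabulary and compute one TF-IDF row per description.
vec = TfidfVectorizer()
desc_tfidf = vec.fit_transform(desc)

print(desc_tfidf.shape)         # (documents, vocabulary size)
print(sorted(vec.vocabulary_))  # the learned terms
```

The result is a sparse matrix, so it has to be combined with the other features (for example via scipy.sparse.hstack or by converting to an array) before modeling.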

8. Selecting the ideal dataset (10 points)

Let’s get rid of some of the unnecessary features.
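
A sketch of dropping columns that have already been transformed or are not useful for modeling. The to_drop list is an assumption; the right choice depends on which engineered columns exist at this point.

```python
import pandas as pd

# Toy frame mixing raw columns with engineered ones.
ufo = pd.DataFrame({
    "date": pd.to_datetime(["2011-11-03 19:21:00", "2004-10-03 19:05:00"]),
    "length_of_time": ["about 5 minutes", "2 minutes"],
    "desc": ["bright light over the hills", "round object hovering"],
    "minutes": [5.0, 2.0],
    "type": ["light", "circle"],
})

# Raw columns whose information now lives in engineered features.
to_drop = ["date", "length_of_time", "desc"]
ufo_dropped = ufo.drop(columns=to_drop)

print(ufo_dropped.columns.tolist())
```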

9. Split the X and y using train_test_split, setting stratify = y (5 points)

In [9]:

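from sklearn.model_selection import train_test_split
X = ufo.drop(["type"], axis=1)
y = ufo["type"].astype(str)
train_X, test_X, train_y, test_y = train_test_split(X, y, stratify=y)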

10. Fit knn to the training sets and print the score of knn on the test sets (5 points)

In [1]:

from sklearn.neighbors import KNeighborsClassifier
knn = KNeighborsClassifier(n_neighbors=5)
# Fit knn to the training sets
knn.fit(train_X, train_y)
# Print the score of knn on the test sets
print(knn.score(test_X, test_y))
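
Steps 9 and 10 can be tried end to end before the UFO features are ready. The sketch below uses synthetic data from make_classification as a stand-in for the preprocessed dataset, and the random_state values are only there to make the run repeatable.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the preprocessed UFO features and labels.
X, y = make_classification(n_samples=200, n_features=5, random_state=42)

# Stratify on y so both splits keep the same class proportions.
train_X, test_X, train_y, test_y = train_test_split(
    X, y, stratify=y, random_state=42
)

knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(train_X, train_y)

# score() returns the mean accuracy on the given test data.
score = knn.score(test_X, test_y)
print(score)
```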
