Large-Scale Database: Machine Learning

// If needed, the learning resources are attached.
You are to use collaborative filtering techniques to predict which political party voters who have not been polled will vote for in an upcoming election. We assume that we have a large data store of voters and many attributes about the voters. Attributes include age_group (with values of young, middle, old), gender, income_bracket (with values of under_50K, 50_150K, 150_300K, over_300K), marital_status, number_of_children, profession (with many different values), education_level (with values of no_high_school, high_school, bachelors, masters, doctor), number_of_automobiles, political_party, and state. Also assume that many voters have already been polled and the party they stated that they would vote for is also stored in the data store.

Design a schema for a structured cloud table such as Accumulo to represent this data.
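As a starting point, one possible layout can be sketched in Python. Accumulo keys are built from a row ID, a column family, and a column qualifier, so the sketch flattens one voter into such entries. The choice of the voter ID as row ID, the family names "attr" and "poll", the qualifier "stated_party", and the helper name voter_to_entries are all assumptions made for illustration and are not prescribed by the assignment.

```python
from typing import Dict, List, Optional, Tuple

# (row ID, column family, column qualifier, value)
Entry = Tuple[str, str, str, str]


def voter_to_entries(
    voter_id: str,
    attributes: Dict[str, str],
    stated_party: Optional[str] = None,
) -> List[Entry]:
    """Flatten one voter into key/value entries for a sorted key-value table.

    Hypothetical layout:
      row ID           = voter ID
      column family    = "attr" for voter attributes, "poll" for the poll answer
      column qualifier = attribute name
      value            = attribute value
    """
    entries: List[Entry] = [
        (voter_id, "attr", name, value) for name, value in sorted(attributes.items())
    ]
    # Only polled voters get a "poll" entry; its absence marks "not polled".
    if stated_party is not None:
        entries.append((voter_id, "poll", "stated_party", stated_party))
    return entries


if __name__ == "__main__":
    for entry in voter_to_entries(
        "voter_000123",
        {"age_group": "young", "gender": "F", "state": "VA"},
        stated_party="Democrat",
    ):
        print(entry)
```

Keeping the poll answer in its own column family is one way to let a scan separate polled from unpolled voters; other layouts, such as prefixing the row ID with the state, are equally plausible.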

Write pseudocode for a function called VoterSimilarity() that determines the similarity between two voters, with the signature:

UserSimilarity similarity =
VoterSimilarity(voterA, voterB);

Assume that categorical attributes such as gender or state have a similarity value of either 1 (same value) or 0 (different values). Assume that attributes whose values lie on a spectrum, such as age_group or education_level, have a value of 1 for an exact match; 0 for one end of the spectrum to the other, such as young to old or no_high_school to doctor; and a proportional fraction for pairs in between, for example no_high_school to high_school is ¾, no_high_school to bachelors is ½, and high_school to doctor is ¼. To determine the overall similarity between two voters, add up the similarity scores across all of the attributes; the sum is the overall score for that pair, and overall scores can then be compared between pairs of voters.
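The scoring rule can be illustrated with a minimal Python sketch. It covers only the four attributes named in the rule (gender, state, age_group, education_level); the remaining attributes would be added to one of the two groups in the same way. The formula 1 - steps / (levels - 1) is an interpretation that reproduces the ¾, ½, and ¼ examples, and the names ORDINAL_SCALES, CATEGORICAL, and ordinal_similarity are hypothetical.

```python
from typing import Dict, List

# Ordered value lists for the spectrum attributes named in the assignment.
ORDINAL_SCALES: Dict[str, List[str]] = {
    "age_group": ["young", "middle", "old"],
    "education_level": [
        "no_high_school", "high_school", "bachelors", "masters", "doctor",
    ],
}

# Attributes scored 1 on an exact match and 0 otherwise.
CATEGORICAL: List[str] = ["gender", "state"]


def ordinal_similarity(scale: List[str], a: str, b: str) -> float:
    """Return 1 for a match, 0 for opposite ends, and a linear fraction between."""
    steps_apart = abs(scale.index(a) - scale.index(b))
    return 1 - steps_apart / (len(scale) - 1)


def voter_similarity(voter_a: Dict[str, str], voter_b: Dict[str, str]) -> float:
    """Sum the per-attribute similarity scores for two voters."""
    score = 0.0
    for attr in CATEGORICAL:
        score += 1.0 if voter_a[attr] == voter_b[attr] else 0.0
    for attr, scale in ORDINAL_SCALES.items():
        score += ordinal_similarity(scale, voter_a[attr], voter_b[attr])
    return score


if __name__ == "__main__":
    a = {"gender": "F", "state": "VA", "age_group": "young", "education_level": "bachelors"}
    b = {"gender": "F", "state": "MD", "age_group": "middle", "education_level": "masters"}
    print(voter_similarity(a, b))  # 1 + 0 + 0.5 + 0.75 = 2.25
```

With this rule the maximum possible score equals the number of attributes compared, so any threshold has to be chosen relative to that count.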

Given a voter who has not been polled, write pseudocode to find all nearest neighbors among the voters who have been polled, that is, every polled voter whose similarity to the given voter passes a certain threshold. The signature is:

Neighborhood neighborhood =
nearestNeighbors (threshold,
voterA /* not polled */,
allVoters /* all voters */);

Assume that you have access to the VoterSimilarity() method from the previous section.
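A minimal Python sketch of the neighborhood search follows. To keep it self-contained, the similarity function is passed in as an extra argument instead of being called globally, which departs slightly from the signature above. The field name "polled_party" (None for voters who have not been polled), the use of >= for "passing" the threshold, and the ordering of the result by descending score are assumptions.

```python
from typing import Callable, Dict, List, Optional, Tuple

Voter = Dict[str, Optional[str]]


def nearest_neighbors(
    threshold: float,
    voter_a: Voter,
    all_voters: List[Voter],
    similarity: Callable[[Voter, Voter], float],
) -> List[Voter]:
    """Return polled voters whose similarity to voter_a is at least threshold."""
    scored: List[Tuple[float, Voter]] = []
    for candidate in all_voters:
        if candidate is voter_a:
            continue  # never compare the voter with itself
        if candidate.get("polled_party") is None:
            continue  # only polled voters can inform the prediction
        score = similarity(voter_a, candidate)
        if score >= threshold:
            scored.append((score, candidate))
    # Most similar neighbors first; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [candidate for _, candidate in scored]


if __name__ == "__main__":
    def toy_similarity(a: Voter, b: Voter) -> float:
        # Stand-in for VoterSimilarity(): counts matching attributes.
        return float(sum(1 for key in ("state", "gender") if a[key] == b[key]))

    target = {"state": "VA", "gender": "F", "polled_party": None}
    voters = [
        target,
        {"state": "VA", "gender": "M", "polled_party": "Republican"},
        {"state": "VA", "gender": "F", "polled_party": "Democrat"},
    ]
    print(nearest_neighbors(1, target, voters, toy_similarity))
```

This version scans every voter, which is linear in the size of the data store per query; on a large table the scan would in practice be restricted or distributed, a choice the sketch leaves open.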

Write pseudocode for predicting how a voter will vote. A simple metric is to determine whether the majority of the neighborhood were polled as Democrat or as Republican.

Vote vote = PredictVote (voterA, neighborhood);
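The majority rule can be sketched in Python as follows. The neighborhood is assumed to be a list of voter records with a hypothetical "polled_party" field; returning None for a tie or an empty neighborhood is also an assumption, since the assignment does not say how to break ties.

```python
from collections import Counter
from typing import Dict, List, Optional

Voter = Dict[str, Optional[str]]


def predict_vote(voter_a: Voter, neighborhood: List[Voter]) -> Optional[str]:
    """Predict voter_a's vote as the majority party among its neighbors.

    voter_a is unused by the simple majority rule; it is kept so the call
    matches the PredictVote(voterA, neighborhood) signature.
    """
    counts = Counter(
        neighbor["polled_party"]
        for neighbor in neighborhood
        if neighbor.get("polled_party") in ("Democrat", "Republican")
    )
    if counts["Democrat"] > counts["Republican"]:
        return "Democrat"
    if counts["Republican"] > counts["Democrat"]:
        return "Republican"
    return None  # tie or empty neighborhood: no prediction


if __name__ == "__main__":
    neighbors = [
        {"polled_party": "Democrat"},
        {"polled_party": "Republican"},
        {"polled_party": "Democrat"},
    ]
    print(predict_vote({"polled_party": None}, neighbors))
```

A natural refinement would weight each neighbor's vote by its similarity score, but the unweighted count matches the simple metric described above.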
