Support vector machine
richajas84
Hi,
This is Riddhi Sharma. I am working with SVMs.
I need to know how an SVM can be used to classify URLs.
My requirement is to determine whether a given URL is a home page or not, using an SVM.
I have a set of URLs that I need to classify as home page or not a home page.
Please help me find a way to do this.

replied to:  richajas84
Elroch
First you need to choose lots of features for your input data set. Some of these features may be categorical (only able to take values in a small discrete set); others may be numerical. For example, whether the page's ROBOTS meta tag has the value "NOINDEX" is a categorical variable, but the byte count of the HTML would be best viewed as a numerical feature.
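
To make the feature step concrete, here is a rough sketch in Python of turning one page into a row of feature values. The function name extract_features and the two URL-based features (path_depth, has_query) are my own illustrative choices, not part of any standard; only the ROBOTS flag and the HTML byte count come from the examples above.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class _RobotsMetaParser(HTMLParser):
    """Collects the content of <meta name="robots" ...>, if present."""

    def __init__(self):
        super().__init__()
        self.robots = None

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attr = dict(attrs)
            if (attr.get("name") or "").lower() == "robots":
                self.robots = (attr.get("content") or "").lower()


def extract_features(url, html):
    """Return a dict of illustrative (hypothetical) features for one page."""
    parsed = urlparse(url)
    segments = [s for s in parsed.path.split("/") if s]

    meta = _RobotsMetaParser()
    meta.feed(html)

    return {
        # Categorical: does the ROBOTS meta tag contain "noindex"?
        "robots_noindex": int(meta.robots is not None and "noindex" in meta.robots),
        # Numerical: size of the HTML in bytes.
        "html_bytes": len(html.encode("utf-8")),
        # Numerical: number of path segments (0 for "http://example.com/").
        "path_depth": len(segments),
        # Categorical: does the URL carry a query string?
        "has_query": int(bool(parsed.query)),
    }
```

You would call this once per page and collect the resulting values into one row per URL. Which features actually help is something you can only find out by testing.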

Once you have decided on every feature you think might have some relevance, you need a good sample of home pages and non-home pages. The output data will be 1 for home pages and 0 for non-home pages. Now you have a multidimensional input data set of feature values and output data, and you are ready to use SVM software to generate a model for your data. Once you have a model, you will want to test it on some different pages (it is important that they have not been used to train the SVM) to see how accurate it is at predicting whether a page is a home page or not.
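
As a minimal sketch of the train-then-test procedure, assuming you use scikit-learn as the SVM software (any SVM package would do), it might look like the following. The numbers in X are made up purely for illustration, and the three columns (path depth, query-string flag, HTML size in kilobytes) are hypothetical features; real data would come from your own labelled URLs.

```python
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Made-up toy data: each row is [path_depth, has_query, html_kilobytes].
X = [
    [0, 0, 12], [0, 0, 30], [0, 0, 8], [1, 0, 15], [0, 0, 22], [0, 0, 18],
    [3, 1, 40], [2, 1, 55], [4, 0, 35], [2, 0, 60], [3, 1, 25], [5, 1, 48],
]
# Labels: 1 for home pages, 0 for non-home pages.
y = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# Hold back a quarter of the pages; they are never seen during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale the features first, since SVMs are sensitive to feature ranges.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
model.fit(X_train, y_train)

# Fraction of held-out pages classified correctly.
accuracy = model.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

With so few rows the accuracy figure means very little; the point is only the shape of the workflow. The kernel and C values here are plain defaults, not recommendations, and you would normally tune them on your own data.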