This article shares the Python source code for a decision tree, for everyone's reference. The details are as follows.
I needed a decision tree for a recent internship, so I reimplemented one using Python's sklearn package.
Tools:
sklearn
graphviz-2.38, to convert the dot file to PDF format (to visualize the decision tree). After downloading and extracting it, add the directory containing its bin files to the PATH environment variable.
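With the Graphviz bin directory on PATH, the conversion itself is done by the dot command-line tool (dot -Tpdf input -o output). Below is a minimal sketch of driving it from Python. The helper names build_dot_command and render_pdf are hypothetical, and the default file names assume that the script in this article has already written resultTree.dot.

```python
import shutil
import subprocess


def build_dot_command(dot_file, pdf_file):
    """Return the Graphviz command line that renders dot_file as a PDF."""
    return ["dot", "-Tpdf", dot_file, "-o", pdf_file]


def render_pdf(dot_file="resultTree.dot", pdf_file="resultTree.pdf"):
    """Run Graphviz if it is installed; return True when dot was executed."""
    if shutil.which("dot") is None:
        # The Graphviz bin directory is not on PATH
        return False
    # check=True raises CalledProcessError if dot reports a failure
    subprocess.run(build_dot_command(dot_file, pdf_file), check=True)
    return True
```

Calling render_pdf() after the script has run should leave resultTree.pdf next to the dot file. A False return value usually means the PATH step above was skipped.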
The source code is as follows:
from sklearn.feature_extraction import DictVectorizer
import csv
from sklearn import tree
from sklearn import preprocessing

# Read the csv file; put the features into a list of dicts and the class labels into a list
allElectronicsData = open(r'E:/DeepLearning/resources/AllElectronics.csv', 'rt')
reader = csv.reader(allElectronicsData)
headers = next(reader)

featureList = []
labelList = []

for row in reader:
    # The last column holds the class label
    labelList.append(row[len(row)-1])
    rowDict = {}
    # Skip column 0 and exclude the label column at index len(row)-1
    for i in range(1, len(row)-1):
        rowDict[headers[i]] = row[i]
    featureList.append(rowDict)
allElectronicsData.close()

print(featureList)

# One-hot encode the categorical features
vec = DictVectorizer()
dummX = vec.fit_transform(featureList).toarray()
print(str(dummX))

# Binarize the class labels
lb = preprocessing.LabelBinarizer()
dummY = lb.fit_transform(labelList)
print(str(dummY))

# criterion='entropy' chooses splits by information gain, as ID3 does
clf = tree.DecisionTreeClassifier(criterion='entropy')
clf = clf.fit(dummX, dummY)
print("clf: " + str(clf))

# Visualize the tree: write it out in Graphviz dot format
# (scikit-learn versions older than 1.0 use vec.get_feature_names() instead)
with open("resultTree.dot", 'w') as f:
    tree.export_graphviz(clf, feature_names=list(vec.get_feature_names_out()), out_file=f)

# How to classify new data: copy the first row and change two of its features
oneRowX = dummX[0, :]
print("oneRowX: " + str(oneRowX))

newRowX = oneRowX.copy()
newRowX[0] = 1
newRowX[2] = 0
# predict() expects a 2-D array of shape (n_samples, n_features)
predictedY = clf.predict(newRowX.reshape(1, -1))
print("predictedY: " + str(predictedY))
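The criterion='entropy' setting makes the tree choose splits by information gain, the measure that ID3 uses. Below is a minimal sketch of that calculation in plain Python. The function names and the tiny data set are made up for illustration, and this is not how scikit-learn implements it internally (scikit-learn builds binary splits over the one-hot encoded columns).

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting the rows on one categorical feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    # Weighted average entropy of the groups produced by the split
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data, not taken from AllElectronics.csv
rows = [
    {"student": "yes", "income": "high"},
    {"student": "yes", "income": "low"},
    {"student": "no", "income": "high"},
    {"student": "no", "income": "low"},
]
labels = ["yes", "yes", "no", "no"]

print(information_gain(rows, labels, "student"))  # 1.0: the split separates the classes perfectly
print(information_gain(rows, labels, "income"))   # 0.0: the split carries no information
```

At each node, an entropy-based tree prefers the split with the largest gain, so in this toy data "student" would be chosen before "income".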
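The AllElectronics.csv file used here has the following format: a header row, then one row per record. The code skips the first column, treats the middle columns as features, and reads the class label from the last column.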
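Below is a minimal sketch of the layout the reading loop expects: a header row, a first column that is skipped, feature columns, and the class label last. The column names and the two data rows are hypothetical placeholders, not the contents of the real AllElectronics.csv, and io.StringIO stands in for the file on disk.

```python
import csv
import io

# Hypothetical sample in the layout the script expects
sample = io.StringIO(
    "RID,age,income,buys_computer\n"
    "1,youth,high,no\n"
    "2,senior,low,yes\n"
)

reader = csv.reader(sample)
headers = next(reader)

feature_list = []
label_list = []
for row in reader:
    # Last column is the class label
    label_list.append(row[-1])
    # Columns 1 .. len(row)-2 are features; column 0 is skipped
    feature_list.append({headers[i]: row[i] for i in range(1, len(row) - 1)})

print(feature_list)
print(label_list)
```

Each dict in feature_list maps a header name to that row's value, which is the input shape DictVectorizer works with.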
Early this morning I finally managed to install the JDK, Eclipse, and PyDev on Linux. But when I tried to install numpy I kept getting an error, and it turned out to be because gcc was missing. So I went to install gcc, which was really frustrating. gcc still has not installed successfully, and I need to think of another way.