Python Code for a Decision Tree Using the sklearn Package

This article shares Python source code for a decision tree, offered for reference. The details are as follows.

For a recent internship, I rewrote the decision tree using Python's sklearn package.

Tools:

sklearn

graphviz-2.38, to convert the dot file to PDF format (to visualize the decision tree). After downloading and extracting it, add its bin directory to the PATH environment variable.
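With graphviz's bin directory on the PATH, the conversion itself is done by the dot command, for example dot -Tpdf resultTree.dot -o resultTree.pdf. The following is a minimal sketch of running that command from Python. The helper names build_dot_command and convert_dot_to_pdf, and the output name resultTree.pdf, are illustrative assumptions rather than part of the original post.

```python
import shutil
import subprocess


def build_dot_command(dot_path, pdf_path):
    """Return the graphviz command line that renders a dot file as a PDF."""
    return ["dot", "-Tpdf", dot_path, "-o", pdf_path]


def convert_dot_to_pdf(dot_path, pdf_path):
    """Run graphviz's dot tool; return False if dot is not on the PATH."""
    if shutil.which("dot") is None:
        return False
    # check=True raises CalledProcessError if dot exits with an error
    subprocess.run(build_dot_command(dot_path, pdf_path), check=True)
    return True
```

Calling convert_dot_to_pdf("resultTree.dot", "resultTree.pdf") after the dot file has been written should produce the PDF. A return value of False suggests that the PATH step above still needs to be done.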

The source code is as follows:

import csv
from sklearn import preprocessing
from sklearn import tree
from sklearn.feature_extraction import DictVectorizer
# Read the CSV file; collect the features as a list of dicts and the class labels as a list
featureList = []
labelList = []
with open(r'E:/DeepLearning/resources/AllElectronics.csv', 'rt', newline='') as allElectronicsData:
    reader = csv.reader(allElectronicsData)
    headers = next(reader)  # the first row holds the column names
    for row in reader:
        # The last column is the class label
        labelList.append(row[len(row) - 1])
        rowDict = {}
        # Skip column 0 and the last column (the label)
        for i in range(1, len(row) - 1):
            rowDict[headers[i]] = row[i]
        featureList.append(rowDict)
print(featureList)
# One-hot encode the categorical features
vec = DictVectorizer()
dummX = vec.fit_transform(featureList).toarray()
print(str(dummX))
# Binarize the class labels
lb = preprocessing.LabelBinarizer()
dummY = lb.fit_transform(labelList)
print(str(dummY))
# criterion='entropy' chooses splits by information gain, as ID3 does
clf = tree.DecisionTreeClassifier(criterion='entropy')
clf = clf.fit(dummX, dummY)
print("clf: " + str(clf))
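# Visualize the tree: write it out in Graphviz dot format
with open("resultTree.dot", 'w') as f:
    # get_feature_names_out() requires scikit-learn 1.0 or later; older releases used get_feature_names()
    tree.export_graphviz(clf, feature_names=list(vec.get_feature_names_out()), out_file=f)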
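# Predict the class of a new sample
oneRowX = dummX[0, :]
print("oneRowX: " + str(oneRowX))
# Copy the row so that editing it does not modify dummX
newRowX = oneRowX.copy()
newRowX[0] = 1
newRowX[2] = 0
# predict() expects a 2-D array of shape (n_samples, n_features)
predictedY = clf.predict(newRowX.reshape(1, -1))
print("predictedY: " + str(predictedY))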
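To make the printed dummX array easier to read, here is a rough pure-Python sketch of the one-hot encoding that DictVectorizer applies to string-valued features: each feature/category pair becomes its own 0/1 column named feature=category, and the columns are sorted by name. The function one_hot_encode and the sample rows are made up for illustration. This is not sklearn's implementation, only an approximation of its behavior for this kind of input.

```python
def one_hot_encode(rows):
    """Encode a list of {feature: category} dicts as rows of 0/1 values.

    Column names follow the "feature=category" pattern and are sorted,
    mirroring how DictVectorizer names columns for string-valued features.
    """
    names = sorted({f"{key}={value}" for row in rows for key, value in row.items()})
    index = {name: i for i, name in enumerate(names)}
    matrix = []
    for row in rows:
        encoded = [0] * len(names)
        for key, value in row.items():
            encoded[index[f"{key}={value}"]] = 1
        matrix.append(encoded)
    return names, matrix


# Hypothetical rows, made up for illustration only
sample = [
    {"age": "youth", "student": "no"},
    {"age": "senior", "student": "yes"},
]
names, matrix = one_hot_encode(sample)
print(names)   # ['age=senior', 'age=youth', 'student=no', 'student=yes']
for encoded in matrix:
    print(encoded)  # [0, 1, 1, 0] then [1, 0, 0, 1]
```

Seen this way, changing a sample's category means setting one column of a feature's group to 1 and another to 0, which is presumably what the two assignments to newRowX in the listing do.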

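The original post showed the layout of AllElectronics.csv as an image, which is not reproduced here. Judging from the code, the first row holds the column headers, the first column is skipped, the last column is the class label, and the columns in between are categorical features.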
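Since the file itself is not available in this copy, the sketch below uses a small hypothetical sample in a layout the listing would accept: an ID column, several categorical feature columns, and a class label in the last column. The column names and the three rows are assumptions made for illustration, not the author's actual data, and load_rows is a hypothetical helper that repeats the reading loop on in-memory text.

```python
import csv
import io

# Hypothetical sample in the assumed layout: ID column, feature columns, label column
SAMPLE_CSV = """RID,age,income,student,credit_rating,class_buys_computer
1,youth,high,no,fair,no
2,middle_aged,high,no,fair,yes
3,senior,medium,yes,excellent,yes
"""


def load_rows(text):
    """Split CSV text into a list of feature dicts and a list of class labels."""
    reader = csv.reader(io.StringIO(text))
    headers = next(reader)  # first row holds the column names
    features, labels = [], []
    for row in reader:
        labels.append(row[-1])  # last column is the class label
        # Skip column 0 (the ID) and the last column (the label)
        features.append({headers[i]: row[i] for i in range(1, len(row) - 1)})
    return features, labels


featureList, labelList = load_rows(SAMPLE_CSV)
print(featureList[0])
print(labelList)
```

With this sample, each feature dict holds the four middle columns and labelList holds the yes/no values from the last column, which is the shape of input that DictVectorizer and LabelBinarizer receive in the listing.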

Early this morning I finally got the JDK, Eclipse, and PyDev installed on Linux. But every attempt to install numpy failed with an error, which turned out to be because gcc was missing. So I set out to install gcc, which was really frustrating: it still hasn't installed successfully, and I need to find another way.
