Handwritten Digit Recognition

Posted by 百般思念 on 2020-03-10 06:05:21

Data Preparation

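Since I do not have a complete handwritten digit set of my own, I use the MNIST handwritten digit database for training and recognition. The official MNIST site is:
http://yann.lecun.com/exdb/mnist/
Because the site is hosted abroad, it may fail to load, and sometimes the page opens but spins forever without downloading anything. No matter: I have prepared the dataset for you, and the end of the article explains how to get it.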
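
If you do manage to download the original files from the site above, they come in the IDX binary format rather than as .bmp images. The following is a minimal sketch of reading an IDX3 image file, assuming the usual layout (four big-endian int32 header values: magic number 2051, image count, rows, columns, followed by uint8 pixels stored row by row). The helper name readIdxImages and the temporary demo file are made up for illustration; the demo writes a tiny two-image file and reads it back so the snippet runs without the real data. Local functions in scripts need MATLAB R2016b or later.

```matlab
% Write a tiny two-image IDX3 file so the reader below can be tried without MNIST.
demoFile = [tempname, '.idx3-ubyte'];        % hypothetical temporary file
demoImgs = uint8(cat(3, [1 2 3; 4 5 6], [7 8 9; 10 11 12]));   % two 2x3 "images"

fid = fopen(demoFile, 'w', 'ieee-be');       % IDX headers are big-endian
fwrite(fid, [2051, size(demoImgs, 3), size(demoImgs, 1), size(demoImgs, 2)], 'int32');
fwrite(fid, permute(demoImgs, [2 1 3]), 'uint8');   % store pixels row by row
fclose(fid);

imgs = readIdxImages(demoFile);              % rows-by-cols-by-N uint8 array
delete(demoFile);
disp(isequal(imgs, demoImgs))                % the round trip should reproduce the images

function imgs = readIdxImages(fileName)
    % Read an IDX3 image file into a rows-by-cols-by-N uint8 array.
    fid = fopen(fileName, 'r', 'ieee-be');
    assert(fid ~= -1, 'Cannot open %s', fileName);
    cleanup = onCleanup(@() fclose(fid));    % close the file even if an error occurs
    header = fread(fid, 4, 'int32');         % magic, count, rows, cols
    assert(header(1) == 2051, 'Not an IDX3 image file');
    n = header(2); rows = header(3); cols = header(4);
    raw = fread(fid, n * rows * cols, 'uint8=>uint8');
    % Pixels arrive row by row, so reshape as cols-by-rows and swap the first two dimensions.
    imgs = permute(reshape(raw, cols, rows, n), [2 1 3]);
end
```

For the real training file you would pass its path to readIdxImages and then flatten each image into one row, as done in the loop later in this article.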

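Displaying the Handwritten Digits

Importing the Image Data

Reading the Image Data

Here 6000 images (600 for each digit from 0 to 9) are used to train the model.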

train_fileName = 'D:\Desktop\matlab_code\machine_learning\handwritting_recognize\train_images\';
train_Files = dir(strcat(train_fileName, '*.bmp'));   % list every .bmp file in the folder
LengthFiles = length(train_Files);
train_img_arr = [];
for i = 1:LengthFiles
    srcimg = imread(strcat(train_fileName, train_Files(i).name));
    img_arr = reshape(srcimg, 1, numel(srcimg));   % flatten the image into a single row
    img_arr = double(img_arr);
    train_img_arr = [train_img_arr; img_arr];      % one image per row
end
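
The flattening step can be checked on a toy matrix. This is only an illustrative sketch: magic(4) stands in for a small grayscale image and is not part of the dataset. It shows that reshape walks the image column by column, and that reshaping the row back to the original size recovers the image.

```matlab
img = uint8(magic(4));                        % stand-in for a small grayscale image
row = double(reshape(img, 1, numel(img)));    % flatten column by column into one row
restored = reshape(row, size(img));           % undo the flattening

disp(row(1:4))                                % the first column of img: 16 5 9 4
disp(isequal(restored, double(img)))          % the image is recovered exactly
```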

Building the Digit Labels

Create a row vector holding the digit labels 0 to 9:

 y= [0, 1:9];

Build a new matrix A by tiling y as a 600 × 1 block, so that A has size 600 × 10:

A=repmat(y,600,1);

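Reshape A into a single column. MATLAB reshapes column by column, so label holds 600 zeros, then 600 ones, and so on; this assumes the images were read in the same order.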

label = reshape(A,6000,1);
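
To see what the repmat and reshape steps produce, here is a small sketch with 3 labels per digit instead of 600 (the variable names perDigit and labelAlt are only for illustration). It also builds the same vector with repelem, which is a possible alternative rather than what the article uses.

```matlab
perDigit = 3;                         % 600 in the article; 3 keeps the output readable
y = [0, 1:9];
A = repmat(y, perDigit, 1);           % perDigit-by-10, every row is 0..9
label = reshape(A, [], 1);            % column-major order: 0,0,0,1,1,1,...
labelAlt = repelem(y, perDigit).';    % the same vector built directly

disp(label(1:7).')                    % 0 0 0 1 1 1 2
disp(isequal(label, labelAlt))
```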

Randomly display (visualize) six handwritten digits

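figure(1)
for ii = 1:6
    subplot(2,3,ii)
    rand_num = randperm(6000,1);  % randperm(n,k) returns k distinct random integers from 1..n, e.g. randperm(8,4) picks four integers between 1 and 8
    image(reshape(train_img_arr(rand_num,:),28,28))
    title(num2str(label(rand_num)),'FontSize',20)  % convert the numeric label to text for the title
    axis off
end
colormap gray  % grayscale colormap running from black to white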

[Figure: randomly displayed handwritten digits]

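Training the Model

Preparing the Training Data

The training data needs one preprocessing step. The model will be trained with MATLAB's built-in Classification Learner app, which takes the training data together with the corresponding labels, so a label column is appended to the right of the raw training data. Since the app is given a table in this walkthrough, array2table() is used to convert the matrix into a table.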

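train = [train_img_arr, label];   % append the labels as the last column
train = array2table(train);       % columns are named train1, train2, ..., as the generated function expects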
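
The column names matter later, because the function generated by the app refers to the predictors by name. As a small sketch on made-up data (the values in pixels and labels are invented), array2table names each column after the input variable followed by the column number:

```matlab
pixels = [10 20 30; 40 50 60];        % two made-up "images" with three pixels each
labels = [0; 1];
train  = [pixels, labels];            % the variable name decides the column names
train  = array2table(train);

disp(train.Properties.VariableNames)  % train1, train2, train3, train4
disp(train.train4.')                  % the last column holds the labels: 0 1
```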

Training with the Classification Learner App

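Find Classification Learner on the Apps tab of the MATLAB toolstrip, or type the following directly in the Command Window:

classificationLearner

to open it. The interface looks like this:
[Screenshot: Classification Learner main window]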
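Click New Session and choose to import data from the workspace. In the dialog that opens, find the workspace variable selector, click the input drop-down, and choose train, as shown below:
![New session import dialog](https://img-blog.csdnimg.cn/20200308213835201.png?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk,shadow_10,text_aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3FxXzQxOTIxNzA5,size_16,color_FFFFFF,t_70)
A few options in this dialog involve algorithmic choices, such as cross-validation and holdout validation; they will be covered in detail in later posts. Then click Start Session in the lower right corner, which brings up the following:
[Screenshot: session started with the imported data]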
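Since this post is meant to help beginners who use MATLAB for machine learning get started quickly, the explanation of the algorithms is kept brief. That said, the choice of algorithm is critical to every project.
Next, in the toolstrip at the top, find the Model Type section, click the small black down arrow on its right, and choose All. Then click Use Parallel, and then click Train on the right. The following appears:
[Screenshot: models being trained]
The screenshot shows the models being trained. When training finishes, the Cubic SVM model turns out to have the best accuracy, 94.2%. Its confusion matrix is:
[Screenshot: confusion matrix of the Cubic SVM model]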
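
A confusion matrix can also be computed outside the app. The sketch below uses confusionmat from the Statistics and Machine Learning Toolbox on a handful of invented labels (trueLabels and predicted are not results from the article); rows correspond to the true class and columns to the predicted class.

```matlab
trueLabels = [0; 0; 1; 1; 2; 2];      % invented ground truth
predicted  = [0; 1; 1; 1; 2; 0];      % invented predictions

[C, order] = confusionmat(trueLabels, predicted);   % rows: true class, columns: predicted class
accuracy = sum(diag(C)) / sum(C(:));                % correct predictions over all predictions
perClassRecall = diag(C) ./ sum(C, 2);              % fraction of each true class that was recovered

disp(C)
fprintf('Overall accuracy: %.3f\n', accuracy);
```

The diagonal counts the correctly classified samples, so the overall accuracy is the trace divided by the total count.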
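Then click Export, and export both the trained model and the corresponding generated function.
The exported model is a struct
[Screenshot: the exported struct in the workspace]
that contains the trained classifier in the field ClassificationSVM and the prediction function predictFcn.
The generated function is as follows: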

function [trainedClassifier, validationAccuracy] = trainClassifier(trainingData)
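% [trainedClassifier, validationAccuracy] = trainClassifier(trainingData)
% Returns a trained classifier and its accuracy. This code recreates the
% classification model trained in the Classification Learner app. Use the
% generated code to automate training the same model with new data, or to
% learn how to train models programmatically.
%
%  Input:
%      trainingData: A table containing the same predictor and response
%       columns as those imported into the app.
%
%  Output:
%      trainedClassifier: A struct containing the trained classifier. The
%       struct has various fields with information about the trained
%       classifier.
%
%      trainedClassifier.predictFcn: A function to make predictions on new
%       data.
%
%      validationAccuracy: A double containing the accuracy in percent. In
%       the app, the History list displays this overall accuracy score for
%       each model.
%
% Use the code to train the model with new data. To retrain the classifier,
% call the function from the command line with the original data or new
% data as the input argument trainingData.
%
% For example, to retrain a classifier trained with the original data set
% T, enter:
%   [trainedClassifier, validationAccuracy] = trainClassifier(T)
%
% To make predictions with the returned "trainedClassifier" on new data T2,
% use
%   yfit = trainedClassifier.predictFcn(T2)
%
% T2 must be a table containing at least the same predictor columns as used
% during training. For details, enter:
%   trainedClassifier.HowToPredict

% Auto-generated by MATLAB on 2020-03-06 11:11:26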


% Extract predictors and response
% This code processes the data into the right shape for training the
% model.
inputTable = trainingData;
% Predictor columns train1 ... train256, built programmatically instead of listed one by one
predictorNames = arrayfun(@(k) sprintf('train%d', k), 1:256, 'UniformOutput', false);
predictors = inputTable(:, predictorNames);
response = inputTable.train257;
isCategoricalPredictor = false(1, numel(predictorNames));   % no predictor is categorical

% Train a classifier
% This code specifies all the classifier options and trains the classifier.
template = templateSVM(...
    'KernelFunction', 'polynomial', ...
    'PolynomialOrder', 3, ...
    'KernelScale', 'auto', ...
    'BoxConstraint', 1, ...
    'Standardize', true);
classificationSVM = fitcecoc(...
    predictors, ...
    response, ...
    'Learners', template, ...
    'Coding', 'onevsone', ...
    'ClassNames', [0; 1; 2; 3; 4; 5; 6; 7; 8; 9]);

% Create the result struct with the predict function
predictorExtractionFcn = @(t) t(:, predictorNames);
svmPredictFcn = @(x) predict(classificationSVM, x);
trainedClassifier.predictFcn = @(x) svmPredictFcn(predictorExtractionFcn(x));

% Add additional fields to the result struct
trainedClassifier.RequiredVariables = sort(predictorNames);   % the same names in alphabetical order: train1, train10, train100, ...
trainedClassifier.ClassificationSVM = classificationSVM;
trainedClassifier.About = 'This struct is a trained model exported from Classification Learner R2019b.';
trainedClassifier.HowToPredict = sprintf('To make predictions on a new table, T, use: \n  yfit = c.predictFcn(T) \nreplacing ''c'' with the name of the variable that is this struct, e.g. ''trainedModel''. \n \nThe table, T, must contain the variables returned by: \n  c.RequiredVariables \nVariable formats (e.g. matrix/vector, datatype) must match the original training data. \nAdditional variables are ignored. \n \nFor more information, see <a href="matlab:helpview(fullfile(docroot, ''stats'', ''stats.map''), ''appclassification_exportmodeltoworkspace'')">How to predict using an exported model</a>.');

% Extract predictors and response
% This code processes the data into the right shape for training the
% model.
inputTable = trainingData;
% Predictor columns train1 ... train256, built programmatically instead of listed one by one
predictorNames = arrayfun(@(k) sprintf('train%d', k), 1:256, 'UniformOutput', false);
predictors = inputTable(:, predictorNames);
response = inputTable.train257;
isCategoricalPredictor = false(1, numel(predictorNames));   % no predictor is categorical

% Perform cross-validation
partitionedModel = crossval(trainedClassifier.ClassificationSVM, 'KFold', 5);

% Compute validation predictions
[validationPredictions, validationScores] = kfoldPredict(partitionedModel);

% Compute validation accuracy
validationAccuracy = 1 - kfoldLoss(partitionedModel, 'LossFun', 'ClassifError');

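From now on, this function can be called directly to train on data. It returns two values: the trained classifier and the model's validation accuracy. Note that the generated function refers to 256 predictor columns (train1 to train256) with train257 as the response, which corresponds to 16 × 16 pixel images; for 28 × 28 images the app would generate 784 predictor names instead.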
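
To see the same training recipe at a scale that runs in seconds, here is a rough sketch on synthetic two-dimensional data. The data, the class centers, and the variable names are invented for illustration; only the calls (templateSVM, fitcecoc, crossval, kfoldLoss) mirror the generated function. It needs the Statistics and Machine Learning Toolbox and MATLAB R2016b or later.

```matlab
rng(0);                                    % reproducible synthetic data
nPerClass = 40;
centers = [0 0; 4 0; 0 4];                 % three invented, well-separated classes
X = [];
y = [];
for k = 1:3
    X = [X; centers(k, :) + randn(nPerClass, 2)];   % points scattered around each center
    y = [y; (k - 1) * ones(nPerClass, 1)];          % class labels 0, 1, 2
end

% Cubic SVM learners combined one-vs-one, as in the generated function
template = templateSVM('KernelFunction', 'polynomial', 'PolynomialOrder', 3, ...
    'KernelScale', 'auto', 'BoxConstraint', 1, 'Standardize', true);
model = fitcecoc(X, y, 'Learners', template, 'Coding', 'onevsone');

% 5-fold cross-validation; accuracy is one minus the misclassification rate
cvModel = crossval(model, 'KFold', 5);
validationAccuracy = 1 - kfoldLoss(cvModel, 'LossFun', 'ClassifError');
fprintf('5-fold validation accuracy: %.3f\n', validationAccuracy);
```

Passing a numeric matrix and a label vector, as here, avoids the table and the long list of predictor names; the app-generated function uses a table because that is how the data was imported.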

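Model Validation

With the work above done, the next step is to validate the model. The samples used for validation are rows 595 to 605 of the training table; given how the labels were built, their true labels are
test=[0,0,0,0,0,0,1,1,1,1,1]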

[trainedClassifier, validationAccuracy] = trainClassifier(train);
yfit = trainedClassifier.predictFcn(train(595:605,:))

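The output posted with the original article is as follows (only ten of the eleven values were listed, and they read 1 and 2 rather than 0 and 1, which suggests that run used labels shifted by one):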
yfit = 11×1
1
1
1
1
1
1
2
2
2
2
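
The rows checked above were also part of the training set, so agreement there tends to look better than it would on unseen images. A common alternative is to hold out part of the data before training. The sketch below illustrates this on invented two-class data with cvpartition and fitcsvm from the Statistics and Machine Learning Toolbox; it is not the article's model, and the variable names are only for illustration.

```matlab
rng(1);                                    % reproducible synthetic data
X = [randn(50, 2); randn(50, 2) + 4];      % two invented classes
y = [zeros(50, 1); ones(50, 1)];

part = cvpartition(y, 'HoldOut', 0.2);     % stratified split: 80% training, 20% test
trainIdx = training(part);
testIdx = test(part);

model = fitcsvm(X(trainIdx, :), y(trainIdx), 'Standardize', true);
yfit = predict(model, X(testIdx, :));      % predictions on samples the model has not seen
yTrue = y(testIdx);

holdoutAccuracy = mean(yfit == yTrue);     % fraction of held-out samples classified correctly
fprintf('Hold-out accuracy: %.3f\n', holdoutAccuracy);
```

With the digit data, the same comparison would be mean(yfit == trueLabels) on a held-out portion of the table.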

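Summary

This completes a simple handwritten digit classifier, though many issues remain. If you would like to discuss it or ask questions, please leave a comment or contact me. QQ: 2214564003
You are welcome to follow my WeChat official account and send "手写数字字符" (handwritten digit characters) to receive the handwritten digit images I have prepared.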
