CSCI-561 Foundations of Artificial Intelligence


1. Overview

In this programming homework, you will implement a multi-layer perceptron (MLP) neural network and use it to classify the hand-written digits shown in Figure 1. You may use numerical libraries such as NumPy/SciPy, but machine learning libraries are NOT allowed (including TensorFlow (v1 & v2), Caffe, PyTorch, Torch, cxxnet, and MXNet). You need to implement feedforward/backpropagation as well as the training process yourself.

2. Data Description

In this assignment you will use the MNIST dataset. You can read its full description on the MNIST website. This dataset consists of four files:

1. Training set images, which contains 60,000 28 × 28 grayscale training images, each representing a single handwritten digit.

2. Training set labels, which contains the associated 60,000 labels for the training images.

3. Test set images, which contains 10,000 28 × 28 grayscale testing images, each representing a single handwritten digit.

4. Test set labels, which contains the associated 10,000 labels for the testing images.

Files 1 and 2 are the training set; Files 3 and 4 are the test set. Each training and test instance in the MNIST database consists of a 28 × 28 grayscale image of a handwritten digit and an associated integer label indicating the digit that this image represents (0-9). Each of the 28 × 28 = 784 pixels of each image is represented by a single 8-bit channel, so each pixel takes a value from 0 (completely black) to 255 (2^8 − 1, completely white).

If you are interested, the raw MNIST format is described on the official MNIST website.

For your convenience, we will use the .csv version of the dataset for submission and grading. In order to access it, please download mnist.pkl.gz and mnist_csv3.py from HW3->resource->asnlib->public to your local machine and run the following command: python3 mnist_csv3.py

After that, you should be able to see Files 1, 2, 3, and 4. The format of our csv files is described in Section 3 (Task description) below.

You can train and test your own networks locally with the whole or a partial dataset. When you submit, we provide a subset of MNIST for your training/testing (not for grading). We reserve a separate grading training/testing set (also a subset of MNIST).

As an option, note that Files 1 and 3 could be combined into File 1+3, and Files 2 and 4 into File 2+4 (with the same index order as File 1+3). Viewed this way, the whole dataset is contained in these two files: File 1+3 contains all the images, and File 2+4 contains all the labels of those images. One advantage of this is that you can partition the whole dataset into training and testing sets any way you like. You may easily modify mnist_csv3.py or simply merge the .csv files to achieve this option.
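For instance, here is a minimal NumPy sketch of this option (the merged file names all_images.csv and all_labels.csv are hypothetical; produce them by merging the .csv files as described above):

import numpy as np

# Load the merged files; each image is a row of 784 pixel values.
images = np.loadtxt("all_images.csv", delimiter=",")             # shape (70000, 784)
labels = np.loadtxt("all_labels.csv", delimiter=",", dtype=int)  # shape (70000,)

# Shuffle once so the split is random, keeping images and labels aligned.
rng = np.random.default_rng(0)
perm = rng.permutation(len(images))
images, labels = images[perm], labels[perm]

# Partition into training and testing sets any way desired, e.g. 60k/10k.
train_images, test_images = images[:60000], images[60000:]
train_labels, test_labels = labels[:60000], labels[60000:]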

3. Task description

Your task is to implement a multi-hidden-layer neural network learner (see the model description section for details of the neural network you need to implement) that will:

(1) Construct a neural network classifier from the given labeled training data,

(2) Use the learned classifier to classify the unlabeled test data, and

(3) Output the predictions of your classifier on the test data into a file in the same directory, and

(4) Finish in 30 minutes (for both training your model and making predictions).

Your program will take three input files and produce one output file as follows:

run your_program train_image.csv train_label.csv test_image.csv → test_predictions.csv

For example,

python3 NeuralNetwork.py train_image.csv train_label.csv test_image.csv → test_predictions.csv

In other words, your algorithm file NeuralNetwork.*** will take training data, training labels, and testing data as inputs, and produce your classification predictions on the testing data as the output file. In your implementation, please do not use any existing machine learning library call.
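For example, here is a minimal sketch of the expected command-line I/O (the training and prediction logic is elided; the placeholder prediction line is obviously not a real classifier):

import sys
import numpy as np

# The three file names arrive as command-line arguments in the order above.
train_image_path, train_label_path, test_image_path = sys.argv[1:4]

train_images = np.loadtxt(train_image_path, delimiter=",")
train_labels = np.loadtxt(train_label_path, delimiter=",", dtype=int)
test_images = np.loadtxt(test_image_path, delimiter=",")

# ... train the network and predict on test_images here ...
predictions = np.zeros(len(test_images), dtype=int)  # placeholder

# One predicted digit per line, no header, written to the same directory.
np.savetxt("test_predictions.csv", predictions, fmt="%d")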

You must implement the algorithm yourself. Please develop your code yourself and do not copy from other students or from the Internet.

The format of ***_image.csv looks like:

a1, a2, a3, …… a784

b1, b2, b3, …… b784

……

where a1, a2, …, a784 are the pixel values, so each row is one image. Each file contains at least one image and at most 60,000 images.

The train_label.csv and your output test_predictions.csv will look like

1

0

2

5

… (a single column giving the predicted class label for each unlabeled sample in the input test file)

The format of your test_predictions.csv file is crucial. It has to have the exact same name and format so that the AI auto-grading scripts can parse it and compare it with the true labels automatically.

When we grade your algorithm, we will use hidden training data and hidden testing data (randomly picked from MNIST) instead of the testing data that was given to you for submission.

Your code will be autograded for technical correctness. Please name your file correctly, or you will wreak havoc on the autograder. The maximum running time to train and test a model is 30 minutes (for both training and testing), so please make sure your program finishes within 30 minutes.

As listed in Section 6, we have two sets to train and evaluate your model. Your model will be trained and evaluated on those two sets independently. The 30-minute running time limit applies separately to training and evaluating on each one of the sets.

4. Model description

The basic structure of the neural network in this homework assignment is shown in Figure 2 below. The figure shows a 2-hidden-layer neural network. The input layer is one dimensional; you need to reshape the input to 1-D yourself. At each hidden layer, you need to use a sigmoid activation function (see references below). Since this is a multi-class classification problem, you need to use the softmax function (see references below) as the activation at the final output layer to generate a probability distribution over the classes. For computing loss, you need to use the cross-entropy loss function (see references below). There is no specific requirement on the number of nodes in each hidden layer; you need to choose them to make your neural network reach its best performance. However, the number of nodes in the input layer should be the number of features, and the number of nodes in the output layer should be the number of classes. A NumPy sketch of these three functions is given after Figure 2.

Figure 2: Example Network Configurations.
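For reference, here is a minimal NumPy sketch of the three functions named above (numerically stabilized versions; the function names are illustrative):

import numpy as np

def sigmoid(z):
    # Hidden-layer activation: squashes each value into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Output-layer activation: turns each row of scores into a probability
    # distribution over the classes. Subtracting the row max avoids overflow.
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, labels):
    # Average negative log-probability of the true class.
    # probs: (batch, 10) softmax outputs; labels: (batch,) integer digits.
    n = len(labels)
    return -np.log(probs[np.arange(n), labels] + 1e-12).mean()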

There are some hyper-parameters you need to tune to get better performance. You need to find the best hyper-parameters so that your neural network can get good performance on the given test data as well as on the hidden grading data.

- Learning rate: the step size for updating the weights (e.g. weights = weights - learning_rate * grads); different optimizers use the learning rate in different ways. (see reference in 2.1)

- Batch size: the number of samples processed before the model is updated. The batch size must be at least one and at most the number of samples in the training dataset. (E.g., if your dataset has 1,000 samples and your batch size is 100, you have 10 batches; each time you train on one batch of 100 samples, and after 10 batches you have trained on all samples in the dataset.)

- Number of epochs: the number of complete passes through the training dataset. (E.g., with 1,000 samples, 20 epochs means you loop over those 1,000 samples 20 times; with a batch size of 100, each epoch trains 1000/100 = 10 batches to cover the entire dataset, and you repeat this process 20 times.)

- Number of units in each hidden layer.

Remember that the program has to finish in 30 minutes, so choose your hyper-parameters wisely. A mini-batch loop over epochs is sketched below.
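Here is a minimal mini-batch loop sketch, assuming the train_images/train_labels arrays from the earlier sketch (sizes and hyper-parameter values are illustrative):

import numpy as np

def iterate_minibatches(images, labels, batch_size, rng):
    # Shuffle once per epoch, then yield aligned slices of size batch_size.
    perm = rng.permutation(len(images))
    for start in range(0, len(images), batch_size):
        idx = perm[start:start + batch_size]
        yield images[idx], labels[idx]

# E.g. 1,000 samples with batch_size=100 gives 10 batches per epoch,
# and 20 epochs repeat that full pass over the data 20 times.
rng = np.random.default_rng(0)
for epoch in range(20):
    for x_batch, y_batch in iterate_minibatches(train_images, train_labels, 100, rng):
        pass  # forward pass, loss, backprop, and weight update go here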

Learning Curve Graph (we will not grade it, but it may help)

In order to make sure your neural network actually learns something, you may want to make a plot showing the learning process of your neural network. After every epoch (one epoch means going through all the samples in your training data once), it is a good idea to record your accuracy on the training set and the validation set (here, simply the test set we give you) and plot those accuracies, as shown in the figure on the right.
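A minimal local plotting sketch with matplotlib (for your own debugging only, not for the submitted program; the accuracy values shown are made up):

import matplotlib.pyplot as plt

# One accuracy value per epoch, recorded at the end of each epoch.
train_acc = [0.85, 0.91, 0.93]  # illustrative values
val_acc = [0.84, 0.90, 0.92]    # illustrative values

epochs = range(1, len(train_acc) + 1)
plt.plot(epochs, train_acc, label="training accuracy")
plt.plot(epochs, val_acc, label="validation accuracy")
plt.xlabel("epoch")
plt.ylabel("accuracy")
plt.legend()
plt.savefig("learning_curve.png")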

5. Implementation Guidance

Suggested Steps

1. Split the dataset into batches.

2. Initialize weights and biases.

3. Select one batch of data and calculate the forward pass - follow the basic structure of the neural network to compute the output of each layer; you may want to cache each layer's output for convenience during backward propagation.

4. Compute the loss function - use cross-entropy (logistic loss - see references above) as the loss function.

5. Backward propagation - use backward propagation (your own implementation) to compute the gradients of the hidden weights.

6. Update weights using an optimization algorithm - there are many ways to update weights; you can use plain SGD or advanced methods such as Momentum and Adam (but you can get full credit easily without any advanced methods).

7. Repeat 3, 4, 5, 6 for all batches - finishing this process for all batches (i.e., iterating once over all data points of the dataset) is called 'one epoch'.

8. Repeat 3-7 for the chosen number of epochs - you might need to train for many epochs to get a good result. As an option, you may want to print the accuracy of your network at the end of each epoch. A skeleton of this loop is sketched below.
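Here is a minimal skeleton of steps 2-8 for a 2-hidden-layer network, reusing the sigmoid/softmax/cross_entropy and iterate_minibatches functions sketched earlier (layer sizes and hyper-parameters are illustrative, not required values):

import numpy as np

n_in, n_h1, n_h2, n_out = 784, 128, 64, 10
lr, batch_size, num_epochs = 0.1, 100, 20

# Step 2: initialize weights and biases.
rng = np.random.default_rng(0)
W1 = rng.normal(0, 0.01, (n_in, n_h1)); b1 = np.zeros(n_h1)
W2 = rng.normal(0, 0.01, (n_h1, n_h2)); b2 = np.zeros(n_h2)
W3 = rng.normal(0, 0.01, (n_h2, n_out)); b3 = np.zeros(n_out)

for epoch in range(num_epochs):  # step 8
    # Steps 1 and 7: split into batches and loop over all of them.
    for X, y in iterate_minibatches(train_images, train_labels, batch_size, rng):
        # Step 3: forward pass, caching each layer's output.
        a1 = sigmoid(X @ W1 + b1)
        a2 = sigmoid(a1 @ W2 + b2)
        probs = softmax(a2 @ W3 + b3)

        # Step 4: cross-entropy loss (for monitoring; print per epoch if desired).
        loss = cross_entropy(probs, y)

        # Step 5: backward propagation.
        n = len(y)
        onehot = np.zeros_like(probs); onehot[np.arange(n), y] = 1.0
        dz3 = (probs - onehot) / n                 # softmax + cross-entropy gradient
        dW3 = a2.T @ dz3; db3 = dz3.sum(axis=0)
        dz2 = (dz3 @ W3.T) * a2 * (1 - a2)         # sigmoid derivative
        dW2 = a1.T @ dz2; db2 = dz2.sum(axis=0)
        dz1 = (dz2 @ W2.T) * a1 * (1 - a1)
        dW1 = X.T @ dz1; db1 = dz1.sum(axis=0)

        # Step 6: plain SGD update.
        W3 -= lr * dW3; b3 -= lr * db3
        W2 -= lr * dW2; b2 -= lr * db2
        W1 -= lr * dW1; b1 -= lr * db1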

Tips

There are many techniques that can speed up the training of your neural network. Feel free to use them. For example, we suggest using vectorization (e.g., NumPy operations) instead of Python for loops. You can also:

1. Try advanced optimizers such as SGD with momentum or Adam.

2. Try other weights initialization methods such as Xavier initialization.

3. Try dropout or batchnorm.

And so on, but you DO NOT really need them to achieve our accuracy goal. A "vanilla" or naive implementation with a proper learning rate can work very well by itself. (A sample Xavier-style initialization is sketched below.)
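As one example of tip 2, a Xavier-style (Glorot uniform) initialization for a single weight matrix might look like this (one common variant, entirely optional):

import numpy as np

def xavier_init(n_in, n_out, rng):
    # Keeps activation variance roughly constant across layers,
    # which helps sigmoid networks train faster than tiny random weights.
    limit = np.sqrt(6.0 / (n_in + n_out))
    return rng.uniform(-limit, limit, (n_in, n_out))

rng = np.random.default_rng(0)
W1 = xavier_init(784, 128, rng)  # e.g. input layer -> first hidden layer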

DO NOT USE ANY existing machine learning library such as Tensorflow and Pytorch.

6. Submission and Grading

As described previously, we will provide 3 input files (train_image.csv, train_label.csv, test_image.csv) in your working path. Your program file should be named NeuralNetwork.*** (if you are using python3 or C++11, name it NeuralNetwork3.py / NeuralNetwork11.cpp) and output a file named test_predictions.csv. You need to make sure the output file name is exactly the same.

The training/testing datasets used at submission and grading time are different, but both are subsets of MNIST.

Grading will be conducted on two sets:

1. MNIST Set:

a. Training: 10,000 images randomly sampled from the MNIST dataset.

b. Testing: 10,000 images randomly selected from the MNIST dataset.

c. There is no overlap between the training and testing images.

2. TA Set:

a. Training: 10,000 images randomly sampled from the MNIST dataset.

b. Testing: 120 hand-written images created by the Homework 3 team.

Grading is based on your prediction accuracy on both testing sets. We will evaluate your model on the two sets separately. The final grade is a weighted average of the credits for the two testing sets:

Final Grade = 0.6 * Credit(MNIST set) + 0.4 * Credit(TA set)

On the MNIST set, we hope you can get at least 90% accuracy; any result better than 90% will get all credit for this set. Results between 50% and 90% will get 50% of the credit for this set, and accuracy below 50% will get no credit for this set.

On the TA set, we hope you can get at least 55% accuracy; any result better than 55% will get all credit for this set. Results between 30% and 55% will get 50% of the credit for this set, and accuracy below 30% will get no credit for this set.

Notice: 90% and 55% are not hard goals; if your implementation is correct, you will find that little extra work is needed to achieve these accuracies. In other words, if you cannot get close to the goal, there is a high possibility that your code has a problem. Directly loading pretrained weights of the neural network is prohibited.

As the TA set is newly created, we provide 10 examples below. Note that those samples are just for illustrative purposes.

7. Academic Honesty and Integrity

All homework material is checked rigorously for dishonesty using several methods. All detected violations of academic honesty are forwarded to the Office of Student Judicial Affairs. To be safe, you are urged to err on the side of caution. Do not copy work from another student or off the web. Keep in mind that sanctions for dishonesty are reflected in your permanent record and can negatively impact your future success. As a general guide:

Do not copy code or written material from another student. Even single lines of code should not be copied.

Do not collaborate on this assignment. The assignment is to be solved individually.

Do not copy code off the web. This is easier to detect than you may think.

Do not share any custom test cases you may create to check your program’s behavior in more complex scenarios than the simplistic ones that are given.

Do not copy code from past students. We keep copies of past work to check for this. Even though this project differs from those of previous years, do not try to copy from homeworks of previous years.

Do not ask on Piazza how to implement some function for this homework, or how to calculate something needed for this homework.

Do not post code on Piazza asking whether or not it is correct. This is a violation of academic integrity because it biases other students who may read your post.

Do not post test cases on Piazza asking for what the correct solution should be.

Do ask the professor or TAs if you are unsure about whether certain actions constitute dishonesty. It is better to be safe than sorry.

DO NOT USE ANY existing machine learning library such as Tensorflow and Pytorch. Violation may cause a penalty to your credit.
