How to read MNIST data in C++?
I did some work with the MNIST data recently. Here's some code that I wrote in Java that should be pretty easy for you to port over:
import net.vivin.digit.DigitImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
/**
* Created by IntelliJ IDEA.
* User: vivin
* Date: 11/11/11
* Time: 10:07 AM
*/
public class DigitImageLoadingService {
private String labelFileName;
private String imageFileName;
/** the following constants are defined as per the values described at http://yann.lecun.com/exdb/mnist/ **/
private static final int MAGIC_OFFSET = 0;
private static final int OFFSET_SIZE = 4; //in bytes
private static final int LABEL_MAGIC = 2049;
private static final int IMAGE_MAGIC = 2051;
private static final int NUMBER_ITEMS_OFFSET = 4;
private static final int ITEMS_SIZE = 4;
private static final int NUMBER_OF_ROWS_OFFSET = 8;
private static final int ROWS_SIZE = 4;
public static final int ROWS = 28;
private static final int NUMBER_OF_COLUMNS_OFFSET = 12;
private static final int COLUMNS_SIZE = 4;
public static final int COLUMNS = 28;
private static final int IMAGE_OFFSET = 16;
private static final int IMAGE_SIZE = ROWS * COLUMNS;
public DigitImageLoadingService(String labelFileName, String imageFileName) {
this.labelFileName = labelFileName;
this.imageFileName = imageFileName;
}
public List<DigitImage> loadDigitImages() throws IOException {
List<DigitImage> images = new ArrayList<DigitImage>();
ByteArrayOutputStream labelBuffer = new ByteArrayOutputStream();
ByteArrayOutputStream imageBuffer = new ByteArrayOutputStream();
InputStream labelInputStream = this.getClass().getResourceAsStream(labelFileName);
InputStream imageInputStream = this.getClass().getResourceAsStream(imageFileName);
int read;
byte[] buffer = new byte[16384];
while((read = labelInputStream.read(buffer, 0, buffer.length)) != -1) {
labelBuffer.write(buffer, 0, read);
}
labelBuffer.flush();
while((read = imageInputStream.read(buffer, 0, buffer.length)) != -1) {
imageBuffer.write(buffer, 0, read);
}
imageBuffer.flush();
byte[] labelBytes = labelBuffer.toByteArray();
byte[] imageBytes = imageBuffer.toByteArray();
byte[] labelMagic = Arrays.copyOfRange(labelBytes, 0, OFFSET_SIZE);
byte[] imageMagic = Arrays.copyOfRange(imageBytes, 0, OFFSET_SIZE);
if(ByteBuffer.wrap(labelMagic).getInt() != LABEL_MAGIC) {
throw new IOException("Bad magic number in label file!");
}
if(ByteBuffer.wrap(imageMagic).getInt() != IMAGE_MAGIC) {
throw new IOException("Bad magic number in image file!");
}
int numberOfLabels = ByteBuffer.wrap(Arrays.copyOfRange(labelBytes, NUMBER_ITEMS_OFFSET, NUMBER_ITEMS_OFFSET + ITEMS_SIZE)).getInt();
int numberOfImages = ByteBuffer.wrap(Arrays.copyOfRange(imageBytes, NUMBER_ITEMS_OFFSET, NUMBER_ITEMS_OFFSET + ITEMS_SIZE)).getInt();
if(numberOfImages != numberOfLabels) {
throw new IOException("The number of labels and images do not match!");
}
int numRows = ByteBuffer.wrap(Arrays.copyOfRange(imageBytes, NUMBER_OF_ROWS_OFFSET, NUMBER_OF_ROWS_OFFSET + ROWS_SIZE)).getInt();
int numCols = ByteBuffer.wrap(Arrays.copyOfRange(imageBytes, NUMBER_OF_COLUMNS_OFFSET, NUMBER_OF_COLUMNS_OFFSET + COLUMNS_SIZE)).getInt();
if(numRows != ROWS || numCols != COLUMNS) {
throw new IOException("Bad image. Rows and columns do not equal " + ROWS + "x" + COLUMNS);
}
for(int i = 0; i < numberOfLabels; i++) {
int label = labelBytes[OFFSET_SIZE + ITEMS_SIZE + i];
byte[] imageData = Arrays.copyOfRange(imageBytes, (i * IMAGE_SIZE) + IMAGE_OFFSET, (i * IMAGE_SIZE) + IMAGE_OFFSET + IMAGE_SIZE);
images.add(new DigitImage(label, imageData));
}
return images;
}
}
How can I read the MNIST dataset with C++?
When working with the MNIST data set, I had the same problem that you had. I could read the labels, but the training and test set images were mostly bogus; the training set was filled almost entirely with 175s, and the testing set was filled almost entirely with 0s (except for the first 6 images). Rebooting did not fix the problem and I was unable to determine why the file reading was not working correctly.
For anyone with this same problem, I would suggest instead using the data files located at http://cis.jhu.edu/~sachin/digit/digit.html. The data is already organized by number (no label/image association required), and the arrays of pixel values are simply encoded one after the other. Knowing that each array is 28x28 and that there are 1000 images for each number, you can easily write code to input the individual image arrays of pixel values.
Reading MNIST Database
That is how I did it:
public static class MnistReader
{
private const string TrainImages = "mnist/train-images.idx3-ubyte";
private const string TrainLabels = "mnist/train-labels.idx1-ubyte";
private const string TestImages = "mnist/t10k-images.idx3-ubyte";
private const string TestLabels = "mnist/t10k-labels.idx1-ubyte";
public static IEnumerable<Image> ReadTrainingData()
{
foreach (var item in Read(TrainImages, TrainLabels))
{
yield return item;
}
}
public static IEnumerable<Image> ReadTestData()
{
foreach (var item in Read(TestImages, TestLabels))
{
yield return item;
}
}
private static IEnumerable<Image> Read(string imagesPath, string labelsPath)
{
BinaryReader labels = new BinaryReader(new FileStream(labelsPath, FileMode.Open));
BinaryReader images = new BinaryReader(new FileStream(imagesPath, FileMode.Open));
int magicNumber = images.ReadBigInt32();
int numberOfImages = images.ReadBigInt32();
// The header stores the number of rows (height) first, then the number of columns (width).
int height = images.ReadBigInt32();
int width = images.ReadBigInt32();
int magicLabel = labels.ReadBigInt32();
int numberOfLabels = labels.ReadBigInt32();
for (int i = 0; i < numberOfImages; i++)
{
var bytes = images.ReadBytes(width * height);
var arr = new byte[height, width];
arr.ForEach((j, k) => arr[j, k] = bytes[j * width + k]); // row-major: row j starts at offset j * width
yield return new Image()
{
Data = arr,
Label = labels.ReadByte()
};
}
}
}
Image class:
public class Image
{
public byte Label { get; set; }
public byte[,] Data { get; set; }
}
Some extension methods:
public static class Extensions
{
public static int ReadBigInt32(this BinaryReader br)
{
var bytes = br.ReadBytes(sizeof(Int32));
if (BitConverter.IsLittleEndian) Array.Reverse(bytes);
return BitConverter.ToInt32(bytes, 0);
}
public static void ForEach<T>(this T[,] source, Action<int, int> action)
{
for (int w = 0; w < source.GetLength(0); w++)
{
for (int h = 0; h < source.GetLength(1); h++)
{
action(w, h);
}
}
}
}
Usage:
foreach (var image in MnistReader.ReadTrainingData())
{
//use image here
}
or
foreach (var image in MnistReader.ReadTestData())
{
//use image here
}
How to read and display MNIST dataset?
There are two problems here. (1) You need to skip the first row because it contains the column labels (1x1, 1x2, and so on), not pixel data. (2) You need the int64 data type. The code below solves both; next(csvreader) skips the first row.
import numpy as np
import csv
import matplotlib.pyplot as plt

with open('./mnist_test.csv', 'r') as csv_file:
    csvreader = csv.reader(csv_file)
    # Skip the header row
    next(csvreader)
    for data in csvreader:
        # The first column is the label
        label = data[0]
        # The rest of the columns are pixels
        pixels = data[1:]
        # Convert those columns into a 1D integer array of length 784
        # The pixel intensity values are integers from 0 to 255
        pixels = np.array(pixels, dtype='int64')
        print(pixels.shape)
        # Reshape the array into a 28 x 28 array (2-dimensional array)
        pixels = pixels.reshape((28, 28))
        print(pixels.shape)
        # Plot
        plt.title('Label is {label}'.format(label=label))
        plt.imshow(pixels, cmap='gray')
        plt.show()
How to read pixels from MNIST digit database and create the iplimage
You have:
unsigned char temp=0;
...
file.read((char*)&temp,sizeof(temp));
With that you are reading a byte into a single char, and overwriting it with each subsequent byte in the file.
When you do this:
create_image(size,3, &temp, i);
temp is only one character long and contains just the last byte read from the file, so your image ends up being whatever happens to be in memory after temp.
You need to allocate an array to hold the image data and increment a pointer into it as you fill it with data.
Also you are creating a 3 channel image, but the MNIST data is only single channel, right?
Also,
imghead->imageData=(char *)data;
should be
cvSetData(imghead, data, size.width)
and
unsigned char *arr[28][28];
should be
unsigned char arr[28][28];
How is the MNIST file read by the following code?
Please don't mix C and C++ unless it is absolutely necessary. The underlying confusion is that the call to fread "moves" the file pointer through the file for you. As @RetiredNinja noted, you are advancing the file pointer 4 bytes at a time. That's how it "knows" how to read the next value even though you didn't tell it to explicitly.
An implementation using slightly more idiomatic C++ could be
#include <algorithm>
#include <cstring>
#include <fstream>
#include <iostream>

// Reads sizeof(int) bytes and reverses them. This assumes a little-endian
// host with a 32-bit int; the header fields in the file are big-endian.
int readFlippedInteger(std::istream &in) {
    char temp[sizeof(int)];
    in.read(temp, sizeof(int));
    std::reverse(temp, temp + sizeof(int));
    int value;
    // memcpy avoids the aliasing and alignment problems of reinterpret_cast
    std::memcpy(&value, temp, sizeof(int));
    return value;
}

int main() {
    std::ifstream fin("MNIST/train-images.idx3-ubyte", std::ios::binary);
    if (!fin) {
        std::cerr << "Could not open file\n";
        return -1;
    }
    // Read the four header fields in file order
    int magicNumber = readFlippedInteger(fin);
    int numImages = readFlippedInteger(fin);
    int numRows = readFlippedInteger(fin);
    int numCols = readFlippedInteger(fin);
    std::cout << magicNumber << std::endl // 2051
              << numImages << std::endl   // 60000
              << numRows << std::endl     // 28
              << numCols << std::endl;    // 28
}
An implementation that uses a user-defined stream manipulator is left as an exercise for the reader.
How to understand MNIST Binary converter in c++?
1-What is the algorithm to convert the data-set in c++ with help of ifstream?
This function reads a file (t10k-images-idx3-ubyte.gz) as follows:
- Read the magic number and adjust endianness
- Read the number of images and adjust endianness
- Read the number of rows and adjust endianness
- Read the number of columns and adjust endianness
- Read all (number of images x rows x columns) pixel bytes, but discard them.
The function uses a plain int and always switches endianness, which means it targets a very specific architecture and is not portable.
How can we go to a specific offset, for example 0004, read a 32-bit integer, and put it into an integer variable?
ifstream provides a function to seek to a given position:
file.seekg( posInBytes, std::ios_base::beg);
At the given position, you could read the 32-bit integer:
int32_t val;
file.read ((char*)&val,sizeof(int32_t));
2- What is the function reverseInt doing?
This function reverses the order of the bytes of an int value. Given a 32-bit integer, it returns the integer made of the same four bytes in the opposite order.
This is useful for normalizing endianness; however, it is probably not very portable, as int might not be 32 bits (it could be, for example, 16 or 64 bits).
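One way around the width problem is a fixed-width type. The sketch below is a hypothetical variant, named reverseBytes32 here, and not the reverseInt from the question; the test values are picked for illustration.

```cpp
#include <cstdint>

// Reverses the four bytes of a 32-bit value using a fixed-width type,
// so the result does not depend on how wide `int` is on the platform.
constexpr std::uint32_t reverseBytes32(std::uint32_t v) {
    return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
           ((v & 0x00FF0000u) >> 8) | ((v & 0xFF000000u) >> 24);
}

// Illustrative value: the bytes 11 22 33 44 come back as 44 33 22 11.
static_assert(reverseBytes32(0x11223344u) == 0x44332211u, "bytes are reversed");
// The image magic number read raw on a little-endian host is 0x03080000;
// reversing it gives 0x00000803, which is 2051.
static_assert(reverseBytes32(0x03080000u) == 2051u, "magic number decodes to 2051");
```

This still reverses unconditionally, so it is only correct on a little-endian host. Assembling the value from individual bytes avoids that assumption as well.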