The Nature Conservancy Fisheries Monitoring

There is no excerpt because this is a protected post.



Statoil Iceberg detection

  Result Wave: U-Net, then a classifier on the generated image. Learning GAN: Learning iceberg and non-iceberg images only works if the discriminator does not learn this difference for the fake outputs! Resources Kaggle


The OCR network is a stacked, bidirectional recurrent neural network with logistic units. Dropout is applied between the RNN layers.   The tested networks are parameterized by: $n$ number of layers. $l_i, i=1,\dots,n$ Number of hidden nodes (forward and backward are equal). $d_i$ Dropout factor after hidden layer $i$. Type of the unit. Results Two RNN layer […]

Handwritten Text Detection

The first goal is to detect handwritten numbers (^[0-9.,/\-\+\’]+$). Results Handwritten [DO]: only digits (0123456789). [RNN-1]: two stacked RNNs with sigmoids and timepool. Batch size=768 for each GPU. [WB=0.1]: White balance with pctmax=0.1 (default). [MB=768]: sample count per batch. [IWND=32,19]: Input window height=32 and baseline-y at 19 (defaults). Results Machinewritten training net 92 with [TP=0.2]: […]

Toy Network

The following network is from programming assignment 3 of the Hinton Coursera course.   Forward $\def\R{\mathcal R}$ $h$ .. count of hidden units. $m$ .. count of samples in batch. $W^{0}\in\R^{h\times 256}$ .. Input to hidden weights. $W^1\in\R^{10\times h} $ .. Hidden to output weights. $Y\in\R^{10\times m}$ .. Targets $S^0\in\R^{256\times m}$ .. Input. $z^1=W^0 […]
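A minimal sketch of a forward pass using the shapes defined above. The excerpt is cut off before the equations are complete, so the logistic hidden activation, the softmax output, and all function names here are assumptions for illustration, not the post's confirmed definitions.

```python
import math
import random


def matmul(a, b):
    """Multiply matrix a (r x k) by matrix b (k x c), both given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def logistic(v):
    return 1.0 / (1.0 + math.exp(-v))


def softmax_columns(z):
    """Apply softmax to each column (one column per sample)."""
    out_cols = []
    for col in zip(*z):
        peak = max(col)  # subtract the maximum for numerical stability
        exps = [math.exp(v - peak) for v in col]
        total = sum(exps)
        out_cols.append([e / total for e in exps])
    return [list(row) for row in zip(*out_cols)]


def forward(w0, w1, s0):
    """Assumed forward pass: input -> logistic hidden layer -> softmax output."""
    z1 = matmul(w0, s0)                              # h x m
    s1 = [[logistic(v) for v in row] for row in z1]  # assumed hidden activation
    z2 = matmul(w1, s1)                              # 10 x m
    return softmax_columns(z2)                       # assumed output activation


def random_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
h, m = 5, 3                                   # hypothetical sizes for the demo
w0 = random_matrix(h, 256, rng)               # W^0 in R^{h x 256}
w1 = random_matrix(10, h, rng)                # W^1 in R^{10 x h}
s0 = random_matrix(256, m, rng, scale=1.0)    # S^0 in R^{256 x m}
probs = forward(w0, w1, s0)                   # 10 x m, each column sums to 1
```

Each column of `probs` can be compared against the matching column of the targets $Y$.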

Point of Interest Localization

The task is to find defined striking points in forms. The POIs are marked in the test images. See also Facial Keypoints Detection for basic concepts. The network is based on VGG with ReLU. Convolution filters are 3×3 only. The building block $C_i$ consists of $c_i$ convolution layers. Residual=yes means that the input of layer $c_i$ in the building […]

ICDAR 2017 Competition on Baseline Detection (cBAD)

  WON!!!!! icdar17-cBAD-o-5 Complex dataset: Layout analysis needs to be performed!  Results [IM] Inner mask only: Train mask only in inner 32×32 part. [TRM] Mask out invalid regions, keep text regions. [EL] Error layer. Default is dice. [ROT] Rotation of patches in training. Default +/- 180° [MLW] Mask line width. Default is $3$. [DM] Mask on dice error […]
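For reference, the dice error named as the default error layer can be sketched as one minus the soft Dice coefficient between a predicted mask and a target mask. The smoothing term `eps` and the function names are assumptions; the post's actual error layer may differ in detail.

```python
def dice_coefficient(pred, target, eps=1e-7):
    """Soft Dice coefficient between two flat masks with values in [0, 1]."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # eps (an assumption) avoids division by zero when both masks are empty
    return (2.0 * intersection + eps) / (total + eps)


def dice_error(pred, target):
    """Error to minimize: 0 for a perfect match, close to 1 for no overlap."""
    return 1.0 - dice_coefficient(pred, target)
```

Unlike a per-pixel error, this measure is not dominated by the large background area, which is helpful when the baseline mask covers only a few pixels.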

Cross Detection

The task is to identify all crosses in a form. Crosses can have many shapes and can be obstructed. Negatives are also sometimes difficult because they look like a cross. Crosses can also be struck out. For this task a U-Net is used to detect a mask. The mask is black and […]


The idea is to pretrain a residual network on the ImageNet database that can be used for other image recognition tasks to prevent overfitting. Global Parameters: Batch size $2\times 256$ with map reduce. $224\times 224$ patches randomly placed in the downscaled image with smaller side $256$. Adadelta $\rho=0.9$. ImageNet CL: $1,281,167$ training images. $50,000$ images for validation. Batchsize […]
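The patch sampling described above can be sketched as follows: rescale so that the smaller side is 256, then pick a random $224\times 224$ window. Only the box coordinates are computed here; the function names and the rounding choice are assumptions.

```python
import random


def rescaled_size(width, height, smaller_side=256):
    """Size after rescaling so that the smaller side equals smaller_side."""
    scale = smaller_side / min(width, height)
    # max() guards against rounding the smaller side below the target
    return (max(smaller_side, round(width * scale)),
            max(smaller_side, round(height * scale)))


def random_crop_box(width, height, patch=224, rng=random):
    """Return (left, top, right, bottom) of a random patch in the rescaled image."""
    new_w, new_h = rescaled_size(width, height)
    left = rng.randint(0, new_w - patch)  # randint is inclusive on both ends
    top = rng.randint(0, new_h - patch)
    return left, top, left + patch, top + patch
```

For a $500\times 375$ image the rescaled size is $341\times 256$, so the patch can shift by up to 117 pixels horizontally and 32 pixels vertically.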

DSB 2017 - Detecting Cancer

DICOM Preprocessing First we have to understand DICOM and convert it into a format that can be used with .NET. The scanner codes the data in Hounsfield units: We use only data from $-1000$ (air) to $400$ (no bones) [2]. Further we need a mapping from world coordinates to pixels. The following meta parameter in […]
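Restricting the data to the range from $-1000$ to $400$ Hounsfield units can be sketched as a clip. The additional rescaling to $[0, 1]$ and the function name are assumptions for illustration, not steps confirmed by the post.

```python
def normalize_hu(values, hu_min=-1000.0, hu_max=400.0):
    """Clip Hounsfield values to [hu_min, hu_max] and scale them to [0, 1]."""
    span = hu_max - hu_min
    # Values below air or above the bone threshold saturate at 0 and 1
    return [(min(max(v, hu_min), hu_max) - hu_min) / span for v in values]
```

A value of $-300$ HU lies exactly in the middle of the range and maps to $0.5$.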

Text Segmentation

The task is to find text in a form excluding the labels. The basic idea is to use a U-Net to find the mask. Three Types: Detect the baseline only. Detect the complete region of the words. Detect baseline in mask and encode font height above baseline in gray color. Net Architecture For type 1 and 2 […]


Analysis of the generic OCR network with the UW3 dataset. Plain network Boostmap Warping X-Height Normalization. Results  The width of the samples is 100 at the beginning of the training. As the groundtruth increases, the width is increased to the full size, so learning is much faster. References Benchmarking of LSTM Networks Challenges and Methods 2013-breuel-high-performance-ocr-for-english-and-fraktur-using-lstm-networks

Spell Correction with RNN

Can we use an RNN for spell correction? The basic idea is to show the RNN a word letter by letter and define the corrected word as output. The training data should be a database. The spelling errors can be created randomly with a distribution representing the error model of the OCR net. For example the following cases […]
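Random creation of training pairs could look like the sketch below. The post's own list of error cases is cut off, so the three error types (substitution, deletion, insertion), their probabilities, and the alphabet are hypothetical placeholders for a real OCR error model.

```python
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz"  # hypothetical character set


def corrupt(word, rng, p_substitute=0.05, p_delete=0.03, p_insert=0.03):
    """Return a randomly corrupted copy of word (all probabilities are assumptions)."""
    out = []
    for ch in word:
        r = rng.random()
        if r < p_substitute:
            # replace the character (may by chance pick the same one)
            out.append(rng.choice(ALPHABET))
        elif r < p_substitute + p_delete:
            continue  # drop the character
        else:
            out.append(ch)
        if rng.random() < p_insert:
            out.append(rng.choice(ALPHABET))  # insert a spurious character
    return "".join(out)


def training_pairs(words, rng, **probs):
    """Build (noisy input, correct target) pairs from a word list."""
    return [(corrupt(w, rng, **probs), w) for w in words]
```

A more faithful error model would draw substitutions from the confusion statistics of the OCR net instead of sampling uniformly.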