Deep learning for computer vision : expert techniques to train advanced neural networks using TensorFlow and Keras / Rajalingappaa Shanmugamani.

EBSCOhost Academic eBook Collection (North America) Available online


Ebook Central Academic Complete Available online


Knovel Optics and Photonics Academic Available online


O'Reilly Online Learning: Academic/Public Library Edition Available online

Format:
Book
Author/Creator:
Shanmugamani, Rajalingappaa, author.
Language:
English
Subjects (All):
Artificial intelligence--Research.
Artificial intelligence.
Neural networks (Computer science).
Physical Description:
1 online resource (290 pages) : illustrations
Edition:
1st edition
Place of Publication:
Birmingham, England : Packt Publishing, 2018.
System Details:
text file
Biography/History:
Rajalingappaa Shanmugamani is currently working as an Engineering Manager for a deep learning team at Kairos. Previously, he worked as a Senior Machine Learning Developer at SAP, Singapore, and at various startups developing machine learning products. He holds a master's degree from the Indian Institute of Technology Madras. He has published articles in peer-reviewed journals and conferences and has submitted applications for several patents in the area of machine learning. In his spare time, he teaches programming and machine learning to school students and engineers.
Summary:
Learn how to model and train advanced neural networks to implement a variety of Computer Vision tasks.

About This Book
Train different kinds of deep learning models from scratch to solve specific problems in Computer Vision.
Combine the power of Python, Keras, and TensorFlow to build deep learning models for object detection, image classification, similarity learning, image captioning, and more.
Includes tips on optimizing and improving the performance of your models under various constraints.

Who This Book Is For
This book is targeted at data scientists and Computer Vision practitioners who wish to apply the concepts of Deep Learning to overcome any problem related to Computer Vision. A basic knowledge of programming in Python, and some understanding of machine learning concepts, is required to get the best out of this book.

What You Will Learn
Set up an environment for deep learning with Python, TensorFlow, and Keras.
Define and train a model for image and video classification.
Use features from a pre-trained Convolutional Neural Network model for image retrieval.
Understand and implement object detection using the real-world Pedestrian Detection scenario.
Learn about various problems in image captioning and how to overcome them by training images and text together.
Implement similarity matching and train a model for face recognition.
Understand the concept of generative models and use them for image generation.
Deploy your deep learning models and optimize them for high performance.

In Detail
Deep learning has shown its power in several application areas of Artificial Intelligence, especially in Computer Vision. Computer Vision is the science of understanding and manipulating images, and has numerous applications in areas such as robotics and automation. This book will show you, with practical examples, how to develop Computer Vision applications by leveraging the power of deep learning. In this book, you will learn different techniques related to object classification, object detection, image segmentation, captioning, image generation, face analysis, and more. You will also explore their applications using popular Python libraries such as TensorFlow and Keras. This book will help you master state-of-the-art deep learning algorithms and their implementation.

Style and approach
This book will teach advanced techniques for Computer Vision, applying the deep learning model in reference to various datasets.

Downloading the example code for this...
Contents:
Cover
Copyright and Credits
Packt Upsell
Foreword
Contributors
Table of Contents
Preface
Chapter 1: Getting Started
Understanding deep learning
Perceptron
Activation functions
Sigmoid
The hyperbolic tangent function
The Rectified Linear Unit (ReLU)
Artificial neural network (ANN)
One-hot encoding
Softmax
Cross-entropy
Dropout
Batch normalization
L1 and L2 regularization
Training neural networks
Backpropagation
Gradient descent
Stochastic gradient descent
Playing with TensorFlow playground
Convolutional neural network
Kernel
Max pooling
Recurrent neural networks (RNN)
Long short-term memory (LSTM)
Deep learning for computer vision
Classification
Detection or localization and segmentation
Similarity learning
Image captioning
Generative models
Video analysis
Development environment setup
Hardware and Operating Systems - OS
General Purpose - Graphics Processing Unit (GP-GPU)
Computer Unified Device Architecture - CUDA
CUDA Deep Neural Network - CUDNN
Installing software packages
Python
Open Computer Vision - OpenCV
The TensorFlow library
Installing TensorFlow
TensorFlow example to print Hello, TensorFlow
TensorFlow example for adding two numbers
TensorBoard
The TensorFlow Serving tool
The Keras library
Summary
Chapter 2: Image Classification
Training the MNIST model in TensorFlow
The MNIST datasets
Loading the MNIST data
Building a perceptron
Defining placeholders for input data and targets
Defining the variables for a fully connected layer
Training the model with data
Building a multilayer convolutional network
Utilizing TensorBoard in deep learning
Training the MNIST model in Keras
Preparing the dataset
Building the model
Other popular image testing datasets
The CIFAR dataset
The Fashion-MNIST dataset
The ImageNet dataset and competition
The bigger deep learning models
The AlexNet model
The VGG-16 model
The Google Inception-V3 model
The Microsoft ResNet-50 model
The SqueezeNet model
Spatial transformer networks
The DenseNet model
Training a model for cats versus dogs
Preparing the data
Benchmarking with simple CNN
Augmenting the dataset
Augmentation techniques
Transfer learning or fine-tuning of a model
Training on bottleneck features
Fine-tuning several layers in deep learning
Developing real-world applications
Choosing the right model
Tackling the underfitting and overfitting scenarios
Gender and age detection from face
Fine-tuning apparel models
Brand safety
Chapter 3: Image Retrieval
Understanding visual features
Visualizing activation of deep learning models
Embedding visualization
Guided backpropagation
The DeepDream
Adversarial examples
Model inference
Exporting a model
Serving the trained model
Content-based image retrieval
Building the retrieval pipeline
Extracting bottleneck features for an image
Computing similarity between query image and target database
Efficient retrieval
Matching faster using approximate nearest neighbour
Advantages of ANNOY
Autoencoders of raw images
Denoising using autoencoders
Chapter 4: Object Detection
Detecting objects in an image
Exploring the datasets
ImageNet dataset
PASCAL VOC challenge
COCO object detection challenge
Evaluating datasets using metrics
Intersection over Union
The mean average precision
Localizing algorithms
Localizing objects using sliding windows
The scale-space concept
Training a fully connected layer as a convolution layer
Convolution implementation of sliding window
Thinking about localization as a regression problem
Applying regression to other problems
Combining regression with the sliding window
Detecting objects
Regions of the convolutional neural network (R-CNN)
Fast R-CNN
Faster R-CNN
Single shot multi-box detector
Object detection API
Installation and setup
Pre-trained models
Re-training object detection models
Data preparation for the Pet dataset
Object detection training pipeline
Training the model
Monitoring loss and accuracy using TensorBoard
Training a pedestrian detection for a self-driving car
The YOLO object detection algorithm
Chapter 5: Semantic Segmentation
Predicting pixels
Diagnosing medical images
Understanding the earth from satellite imagery
Enabling robots to see
Datasets
Algorithms for semantic segmentation
The Fully Convolutional Network
The SegNet architecture
Upsampling the layers by pooling
Sampling the layers by convolution
Skipping connections for better training
Dilated convolutions
DeepLab
RefiNet
PSPnet
Large kernel matters
DeepLab v3
Ultra-nerve segmentation
Segmenting satellite images
Modeling FCN for segmentation
Segmenting instances
Chapter 6: Similarity Learning
Algorithms for similarity learning
Siamese networks
Contrastive loss
FaceNet
Triplet loss
The DeepNet model
DeepRank
Visual recommendation systems
Human face analysis
Face detection
Face landmarks and attributes
The Multi-Task Facial Landmark (MTFL) dataset
The Kaggle keypoint dataset
The Multi-Attribute Facial Landmark (MAFL) dataset
Learning the facial key points
Face recognition
The labeled faces in the wild (LFW) dataset
The YouTube faces dataset
The CelebFaces Attributes dataset (CelebA)
CASIA web face database
The VGGFace2 dataset
Computing the similarity between faces
Finding the optimum threshold
Face clustering
Chapter 7: Image Captioning
Understanding the problem and datasets
Understanding natural language processing for image captioning
Expressing words in vector form
Converting words to vectors
Training an embedding
Approaches for image captioning and related problems
Using a condition random field for linking image and text
Using RNN on CNN features to generate captions
Creating captions using image ranking
Retrieving captions from images and images from captions
Dense captioning
Using RNN for captioning
Using multimodal metric space
Using attention network for captioning
Knowing when to look
Implementing attention-based image captioning
Chapter 8: Generative Models
Applications of generative models
Artistic style transfer
Predicting the next frame in a video
Super-resolution of images
Interactive image generation
Image to image translation
Text to image generation
Inpainting
Blending
Transforming attributes
Creating training data
Creating new animation characters
3D models from photos
Neural artistic style transfer
Content loss
Style loss using the Gram matrix
Style transfer
Generative Adversarial Networks
Vanilla GAN
Conditional GAN
Adversarial loss
Image translation
InfoGAN
Drawbacks of GAN
Visual dialogue model
Algorithm for VDM
Generator
Discriminator
Chapter 9: Video Classification
Understanding and classifying videos
Exploring video classification datasets
UCF101
YouTube-8M
Other datasets
Splitting videos into frames
Approaches for classifying videos
Fusing parallel CNN for video classification
Classifying videos over long periods
Streaming two CNN's for action recognition
Using 3D convolution for temporal learning
Using trajectory for classification
Multi-modal fusion
Attending regions for classification
Extending image-based approaches to videos
Regressing the human pose
Tracking facial landmarks
Segmenting videos
Captioning videos
Generating videos
Chapter 10: Deployment
Performance of models
Quantizing the models
MobileNets
Deployment in the cloud
AWS
Google Cloud Platform
Deployment of models in devices
Jetson TX2
Android
iPhone
Other Books You May Enjoy
Index
Notes:
Includes index.
Description based on online resource; title from PDF title page (EBC, viewed February 22, 2018).
ISBN:
9781523116751
1523116757
9781788293358
1788293355
OCLC:
1022793819
