Mobile Vision Technology: Making sense of the visual world anytime, anywhere!

by Taniya Arya

Going mobile is no longer a luxury for organizations; it’s a necessity for any business hoping to stay relevant in today’s world. Technological advancements are all around us, and they are changing the world faster than we can keep up with them.

An interesting trend is that the exponential growth in smartphone adoption, along with smartphones’ state-of-the-art capabilities, is increasingly being exploited to bring vision-based mobile applications to the forefront for industries pursuing a mobile-first approach. Applications using technologies like Text Recognition, Face Detection, Augmented Reality, and Barcode Scanning, among others, are transforming the way we interact with the world around us.

Before taking a plunge into the powerful APIs based on Mobile Vision technology, let us first understand how Computer Vision and Machine Learning will play a crucial role in the next phase of evolution of mobile devices.

Computer Vision is, at its core, the task of interpreting images and making sense of visual data. It involves capturing visual data and employing Machine Learning to extract meaning from it. Machine Learning, to put it simply, enables computers to learn without being explicitly programmed. Mobile Vision brings these Computer Vision concepts to mobile devices.

“64% of organizations found that mobile is essential for their business technology agenda”

Red Hat Mobility Maturity Survey

Mobile devices such as smartphones and tablets are nowadays equipped with high-resolution cameras, powerful processors, and intelligent sensors that are helping spawn new Mobile Vision based applications. Google’s Project Tango is a perfect example of an exciting Mobile Vision technology: it maps out a complete 3-D structure of an area simply by pointing a tablet around it! Such powerful Mobile Vision APIs will help developers build high-profile apps that make sense of visual data through real-time, on-device vision technology.

Mobile Vision Technology

Many forward-thinking companies are embracing Mobile Vision technologies that run entirely on the mobile device rather than offloading processing to a PC, a server GPU, or disk storage. This not only helps them interpret the visual world on the go, but also supports agility in the business model.


Here are some powerful Mobile Vision APIs, which can be applied separately or in conjunction with each other for mobile app development:

Face detection with the Mobile Vision API: Face detection is an advanced API specifically designed to detect human faces in images and videos, for example for editing purposes. This intelligent API can even detect faces at different orientations. Moreover, it can locate specific facial landmarks, such as the eyes, the nose, and the mouth.

Barcode detection with the Mobile Vision API: This is another exciting Mobile Vision API; it reads and decodes a wide array of barcode formats quickly and easily. Each detected barcode represents a single recognized code along with its corresponding value.
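To make the decoding step concrete, here is a small, purely illustrative Python sketch (not the Mobile Vision API itself, which is an Android library) of the arithmetic a decoder applies to validate an EAN-13 barcode via its check digit:

```python
def ean13_is_valid(code: str) -> bool:
    """Validate an EAN-13 barcode using its check digit.

    Digits in odd positions (1st, 3rd, ...) are weighted 1 and digits in
    even positions weighted 3; the total must be a multiple of 10.
    """
    if len(code) != 13 or not code.isdigit():
        return False
    digits = [int(c) for c in code]
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits))
    return total % 10 == 0

# A genuine EAN-13 code validates; corrupting any single digit does not.
print(ean13_is_valid("4006381333931"))  # True
print(ean13_is_valid("4006381333930"))  # False
```

A scanner uses exactly this kind of checksum to reject misreads before reporting a value back to the application.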

Text recognition with the Mobile Vision API: This API is extremely helpful for detecting text in images and video streams and then recognizing that text in real time, on the device.

As for KritiKal Solutions’ own expertise in Mobile Vision, we have developed an intelligent in-house Optical Character Recognition (OCR) engine, which has powered various products and applications such as vehicle license plate recognition, industrial inspection, container text identification, and document digitization. This OCR engine gives a computer the ability to read text that appears in an image, with algorithms that make sense of signs, pages of text, articles, or any other place text appears as part of an image. KritiKal persistently works on building powerful and reliable OCR capability that functions well across a wide range of Android devices without increasing the size of the app.

There is no denying that the power of Mobile Vision can be your company’s springboard to growth and transformation. So let’s venture into this terra incognita and explore its powerful capabilities for locating and describing visual objects on the go.


Deep Learning: Fueling The Computing Industry To Bring About A Revolution!

by Taniya Arya

I am sure you browsed through the Artificial Intelligence (AI) enabled tech gadgets showcased at the Consumer Electronics Show (CES) this year. I did, and I was fascinated by the new technologies unveiled at the grand tech event, right from smart autonomous cars to delightful robots to the powerful Lego Boost robotic kits that can teach coding. This blog post talks about the latest AI advancement, Deep Learning, which is fueling the computing industry and will completely transform corporate architecture in the coming years.


Deep learning is creating efficiencies in our power grids, percolating into our smart gadgets, gaining momentum across healthcare, augmenting our agricultural production, and, interestingly enough, helping us find solutions to climate change.

By putting its DeepMind artificial intelligence in charge of its data center facilities, Google achieved about a 40% reduction in the energy used for cooling.

The Concept Demystified

Deep Learning is a subset of Machine Learning, which in turn is a subset of Artificial Intelligence. AI is the broadest field, encompassing everything from logic- and rule-based systems designed to solve problems to systems that learn. Within AI, Machine Learning covers a suite of algorithms that sift through data to improve decision-making. And within Machine Learning sits Deep Learning, which makes sense of data using multiple layers of abstraction.
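As a toy illustration of what “multiple layers of abstraction” means, here is a minimal two-layer neural network in plain Python. The weights are hand-set for clarity rather than learned (learning them from data is, of course, where real deep learning comes in); the point is only to show how stacked layers build intermediate concepts, here OR and AND, that a final layer combines into XOR:

```python
def step(z: float) -> int:
    """Threshold activation: fire (1) if the weighted sum is positive."""
    return 1 if z > 0 else 0

def layer(inputs, weights, biases):
    """One fully connected layer: weighted sum plus bias, then activation."""
    return [step(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

def xor_net(x1: int, x2: int) -> int:
    # Hidden layer extracts two intermediate concepts: OR and AND.
    hidden = layer([x1, x2], [[1, 1], [1, 1]], [-0.5, -1.5])
    # Output layer combines them: "OR but not AND" is exactly XOR.
    return layer(hidden, [[1, -1]], [-0.5])[0]

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, "->", xor_net(a, b))  # 0, 1, 1, 0
```

No single neuron can compute XOR; it becomes easy once a hidden layer supplies the right abstractions, which is the core intuition behind deep networks.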


The AI Revolution

Decades-old discoveries in the field of artificial neurons are now transforming the computing industry, bringing massive disruption. AI capabilities are fueling innovation at an unprecedented rate, and quantum leaps have happened in the quality of a multitude of everyday technologies. Most obviously, the speech-recognition systems on our smart devices work far more reliably than they used to. Nowadays, when you use a voice command to call your spouse, you reach them and not your angry boss!

Deep Learning is an exciting tool that is supporting a host of industries creating cutting-edge AI applications, right from self-driving cars to speech-recognition systems. – Andrew Ng, Chief Scientist at Baidu

In today’s milieu, one can seamlessly interact with computers simply by talking to them, whether it’s Amazon’s Alexa, Apple’s Siri, Microsoft’s Cortana, or Google’s voice-responsive features. Machine translation and other forms of natural language processing have become far better with the continuous advancements happening in artificial intelligence. In line with voice recognition, advancements have also happened in image recognition. Tech pioneers with prowess in deep learning are building high-profile applications that let you search or organize collections of images that carry no identifying labels. You can ask to be shown all the ones that have snow in them, or even something as subtle as the shadow of buildings.

Tech behemoths are increasingly aligning towards deep learning to unleash improvements in robotics, autonomous drones, and of course driverless cars. For example, Tesla, Baidu, and Alphabet are all testing prototypes of autonomous vehicles on roads. But have you ever realized that all these breakthroughs are, in essence, the same breakthrough? That’s right: deep learning.

The science behind deep learning, also referred to as deep neural networks, dates back to the 1950s, though much of the breakthrough research happened in the 1980s and 1990s. The most exciting thing about neural networks is that no human programmed a computer to perform any of the feats mentioned above. Instead, the programmers fed a computer a learning algorithm, exposed it to vast amounts of data (in the form of audio, video, and images) to train it, and then allowed the computer to figure out for itself how to identify specific objects, words, images, and so on.

The Bottom Line

Evolution and advancements in raw computing power have made deep learning a reality and not just an academic pursuit. Research firm CB Insights reports that equity funding of AI startups reached an all-time high of more than $1 billion last quarter. Venture capitalists who weren’t even aware of deep learning five years ago are today wary of startups that don’t have expertise in it. We now live in a world where it will be inevitable for tech firms building high-profile software applications to have a deep learning arm. The day is not far when people will demand, “Where is your natural language processing version?” or “How do I talk to your application? Because I don’t want to tab across menus.” Needless to say, deep learning will soon influence every aspect of our lives, and the more it is used, the better it will become.

The Internet of Things 101

by Taniya Arya

Think about a world where every device in your workspace, home, and car is connected. A world where the coffee starts brewing the moment the morning alarm goes off, lights automatically turn on when you enter the living room, groceries arrive at your doorstep when your stocks are running low, and the door automatically locks when a stranger approaches the gate. Well, this is the world that the Internet of Things (IoT) can turn into reality.

As the cost of connecting devices gradually decreases, more and more devices are being built with Wi-Fi capabilities and in-built sensors, creating value from everyday things.

According to International Data Corporation (IDC), the market for wearable devices will experience a growth rate of 20.3%, culminating in 213.6 million units shipped in 2020.


So, what is the Internet of Things exactly?

IoT is a massive network of devices connected to the Internet. This includes anything from jet engines, tablets, and smartphones to washing machines, coffee makers, and wearable devices: anything with built-in sensors. To put it simply, IoT is the concept of connecting any device that can be turned on and off to the Internet, or to other devices. The main purpose of IoT is to collect and exchange data to create value.
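A stripped-down sketch of this collect-and-exchange pattern: devices publish readings to named topics on a broker, and interested applications subscribe, in the spirit of (but vastly simpler than) real IoT protocols such as MQTT. All names and payloads here are made up for illustration:

```python
class Broker:
    """A minimal in-memory message broker: devices publish readings to
    named topics, and any number of subscribers receive each message."""
    def __init__(self):
        self.subscribers = {}  # topic -> list of callback functions

    def subscribe(self, topic, callback):
        self.subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, payload):
        for callback in self.subscribers.get(topic, []):
            callback(payload)

broker = Broker()
log = []
# A hypothetical "smart home" app reacting to a temperature sensor.
broker.subscribe("home/livingroom/temp",
                 lambda c: log.append(f"thermostat saw {c} degC"))
broker.publish("home/livingroom/temp", 23.5)
print(log)
```

The value comes from decoupling: the sensor knows nothing about the thermostat app, so new subscribers (logging, alerting, analytics) can be added without touching the device.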

Quite indisputably, the need for connected infrastructure is surging rapidly across disparate industries. IoT not only unravels a new dimension of services that improve consumers’ quality of life, but also increases the profitability of enterprises. For consumers, IoT can deliver solutions that significantly augment health, security, energy efficiency, and other aspects of daily life. For enterprises, IoT can bolster solutions that enhance decision-making for industries across different verticals.

As per research conducted by Gartner, 30% of our interactions with technology will happen through internet-connected machines or smart devices by 2018.

These new internet-ready devices generate terabytes or even petabytes of information that can considerably augment enterprise applications, making them more attentive to what’s happening in real time. This includes a host of physical objects connecting to millions of smart gadgets such as mobile phones, tablets, wearables, connected cars, and more.

Technology research giant Gartner Inc. forecasts that by 2020, about 250 million cars will be connected, and the number of wirelessly connected devices is set to exceed 100 million objects, or about 26 smart objects for every person on the planet.

Application areas for the Internet of Things

Smart Homes: Homes and buildings are being equipped with wireless sensors that gather data about movement, heat, light, and use of space. From augmenting security to reducing maintenance and energy costs, KritiKal offers a wide range of innovative IoT technologies to develop smart home applications that monitor and control intelligent buildings and smart homes.

Automotive: With the help of sensor-driven applications, road accidents, traffic congestion, and emergency service costs can all be reduced. A wireless, connected transit infrastructure enables a safer, more efficient driving experience for people on and off the road. KritiKal provides innovative IoT-enabled solutions for automobiles.

Wearables: Over the past few years, wearable technology has become popular around the world. From fitness trackers to smart watches, IoT gives consumers better control over their lives. KritiKal develops smart, ultra-low-power IoT solutions for the wearables vertical.

In short, the Internet of Things bridges the gap between the cyber and the physical world. This, in turn, enables businesses to reinvent how products and services are conceptualized and delivered.

Mind Mapping Tools: What and Why?

by Vivek Singh

The beginning of a new project is often the time when all the stakeholders involved put on their thinking hats and brainstorm (and for good reason). The logical next step to this is putting down all the ideas and thoughts about the project in an organized and comprehensive manner. Structured ideas enable all interested to clearly analyze, comprehend, and prioritize possible next steps.

During the early part of my career, I always thought that coding was the single most important part of developing a mobile application. But as I gained experience, I was exposed to the different phases of the app development life cycle (SDLC). It was then that I understood that coding alone is not sufficient; there are other processes that are just as crucial.

So what are these other processes about and why do we need them? The answer is very simple – for clarity’s sake! The clearer we are about the product, the more efficiently we can develop it.

Considering the present-day scenario, do we really have the time to read lengthy documents when multiple tasks compete for our attention simultaneously? Is capturing information from long documents feasible in a fast-paced delivery environment? Can we really map such huge documents in a way that efficiently brings project teams and clients onto a single platform?

So what can be done, and what are the possible alternatives? One possible solution is to optimize the way we work by using one of the many mind mapping tools available in the market. Mind mapping is a visual way of representing ideas and concepts that shows the relationships between them.

In this post, I am covering the mind mapping tool I use most frequently: XMind. XMind makes not only my job but also my life easier. The tool helps map ideas onto a single screen, which allows one to absorb all the thoughts in one go. The software supports all Microsoft Office formats, allowing easy export of mind maps created inside XMind.

Here are some of the major features and benefits of this tool:

  • Rich set of different visualization styles
  • Allows sharing of created mind maps via their website
  • A number of templates to help you get started
  • Different icons and symbols
  • Allows deriving and mapping of requirements and functionality
  • Helps prioritize and schedule the structure for client meetings
  • Enables the client in getting a clear vision of the app
  • Features like Gantt charts, a presentation mode, export features, audio notes, a merge feature, private online sharing, etc.
  • Most importantly- helps you in landing the project!
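Under the hood, a mind map is simply a tree of ideas. Here is a tiny illustrative Python sketch (the node names are made up, and this has nothing to do with XMind’s actual file format) showing how such a tree can be flattened into the kind of indented outline these tools export:

```python
def outline(node, depth=0):
    """Render a mind map (a plain tree of ideas) as an indented outline."""
    name, children = node
    lines = ["  " * depth + "- " + name]
    for child in children:
        lines.extend(outline(child, depth + 1))
    return lines

# A hypothetical mind map for a mobile app project.
app_map = ("Mobile App", [
    ("Requirements", [("Login", []), ("Push notifications", [])]),
    ("Milestones", [("Prototype", []), ("Beta", [])]),
])
print("\n".join(outline(app_map)))
```

The same tree can just as easily be rendered to a Gantt chart, a slide deck, or a requirements document, which is why these tools sit so naturally at the start of a project.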

Here is a screenshot that describes the tool:

mindmapping

Best of all, the application is completely free and open source. If you do have some cash, go ahead and buy the paid commercial ‘Pro’ version, which offers additional features including extended import/export options. Other mind mapping tools that have gained popularity are Mindjet, Coggle, Freemind, and Mindnote. So go ahead, explore these tools and simplify your life!

Voice & Data Multiplexing Card

Overview

A leading public sector unit under the Ministry of Defence, Government of India, required a voice and data multiplexing card for use in a digital Electronic Private Automatic Branch Exchange (EPABX) system. The system was required to multiplex pulse code modulated (PCM) voice traffic and Ethernet data between two points over an E1 link.

KritiKal successfully delivered a higher-capacity, cost-optimized, and easily upgradable system, along with support for hardware testing, environmental testing, and complete integration and interoperability testing.

Our Solution

  • Customized system to bridge four Ethernet LANs over the existing E1 based telecom links by mapping Ethernet frames into TDM as per HDLC (RFC 1662)
  • KritiKal’s domain expertise and in-house IPs, with intimate knowledge of Linux kernel networking, enabled a technically viable, cost-optimal solution in an aggressive time frame. The in-house soft IP core, PDH Framer, was deployed for E1 framing
  • For transmission and reception over existing E1 based telecom links, user-configurable ‘n’ x 64 Kbps voice traffic channels were mapped onto the E1 format
  • Efficient and optimized FPGA design
  • CAS, CAS MFR2, and DTMF signaling protocols were also supported, complying with the ITU-T standard recommendations
  • Real-time computation of complex DSP functions was required to implement E1 signaling protocols on voice channels
  • Provision for working at a 5V supply apart from the standard telecom voltage level of -48V
  • Hot-plugging into a live backplane
  • E1 interface as per ITU-T G.703, G.704, and G.732 hardware recommendations
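For readers unfamiliar with E1 framing, the channel mapping described above can be sketched in a few lines. This is a simplified, purely illustrative Python model (the framing and signaling byte values are placeholders, not the actual G.704 bit patterns): an E1 frame carries 32 one-byte timeslots at 8000 frames per second (32 x 8 bits x 8000 = 2.048 Mbps), with timeslot 0 reserved for framing and timeslot 16 for CAS signaling, leaving up to 30 slots for 64 Kbps channels:

```python
FRAME_SLOTS = 32
RESERVED = {0: 0x9B, 16: 0x00}  # illustrative framing/signaling bytes

def mux_frame(channel_bytes):
    """Map up to 30 channel bytes (one PCM sample each) into one E1 frame."""
    payload_slots = [s for s in range(FRAME_SLOTS) if s not in RESERVED]
    assert len(channel_bytes) <= len(payload_slots)
    frame = [0x00] * FRAME_SLOTS          # idle pattern for unused slots
    for slot, value in RESERVED.items():
        frame[slot] = value
    for slot, byte in zip(payload_slots, channel_bytes):
        frame[slot] = byte
    return frame

frame = mux_frame([0xA1, 0xA2, 0xA3])     # three active 64 Kbps channels
print(hex(frame[1]), hex(frame[16]))      # channel 0 in slot 1; slot 16 reserved
```

In the delivered card this interleaving runs in FPGA hardware at line rate; the sketch only illustrates the timeslot arithmetic.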

Hardware Features

  • Exports bidirectional HDLC channel of 64 kbps for exchange of control and configuration information
  • Four Ethernet ports – 10/100 Mbps
  • MPC8321 processor used
  • 64MB DDR
  • NOR flash size – 4MB
  • FPGA used: Lattice XPS-17E
  • Linux v2.6.XX kernel with built-in Ethernet bridging utilities for implementing Ethernet encapsulation over E1; the in-house soft IP core, PDH Framer, was used for E1 framing

Software Features

  • Implements R2 line signaling as per ITU-T recommendation Q.421, Q.422 and Q.424
  • Implements R12 inter-register signaling as per ITU-T Q.440, Q.441 recommendations
  • Supports CAS Decadic, CAS DTMF, CAS MFR2, and ITU-T G.832 signaling protocols. Efficient and optimized implementation of complex DSP algorithms for generation and detection of DTMF and MFC signals, supporting 120 simultaneous subscriber calls
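DTMF detection of the kind listed above is classically done with the Goertzel algorithm, which measures signal power at just the eight DTMF frequencies far more cheaply than a full FFT. Below is a self-contained Python sketch of the idea; it maps only a subset of the 4x4 keypad and omits the amplitude and twist checks a production detector on 120 simultaneous calls would need:

```python
import math

DTMF_FREQS = [697, 770, 852, 941, 1209, 1336, 1477, 1633]  # Hz
KEYS = {(697, 1209): "1", (697, 1336): "2",
        (770, 1209): "4", (941, 1336): "0"}  # subset, for brevity

def goertzel_power(samples, freq, rate):
    """Signal power at one frequency via the Goertzel recurrence."""
    coeff = 2 * math.cos(2 * math.pi * freq / rate)
    s_prev = s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev2 ** 2 + s_prev ** 2 - coeff * s_prev * s_prev2

def detect_key(samples, rate=8000):
    """Pick the strongest low and high DTMF tones and map them to a key."""
    powers = {f: goertzel_power(samples, f, rate) for f in DTMF_FREQS}
    low = max(DTMF_FREQS[:4], key=lambda f: powers[f])
    high = max(DTMF_FREQS[4:], key=lambda f: powers[f])
    return KEYS.get((low, high))

# Synthesize 50 ms of the "2" tone (697 Hz + 1336 Hz) at 8 kHz PCM rate.
rate, n = 8000, 400
tone = [math.sin(2 * math.pi * 697 * t / rate) +
        math.sin(2 * math.pi * 1336 * t / rate) for t in range(n)]
print(detect_key(tone))  # "2"
```

Running one Goertzel filter per frequency per channel is what makes real-time detection across many voice channels tractable on modest hardware.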

Environmental Features

  • Industrial grade components (-20 degC to +70 degC).
  • Vibration resistant.
  • Surge and spike protection.
  • EMI / ESD tolerant.

Design Tools

  • Design environment for FPGA: Lattice IspLever.
  • Hardware tool: ORCAD.
  • PCB Development: CADStar.
  • Software tool: Linux based Makefile project.

Mosaicing Lite

Multi-purpose mosaic generation engine

Overview

A leading defence lab required a high-end processing system that would take input video and flight data and produce high-quality mosaics, helping defence personnel interpret the data.

Scope

The Mosaicing System is a high-end processing system employing state of the art computer vision and image processing techniques. The Mosaicing System software consists of several modules for facilitating video and telemetry I/O, free and geo mosaic generation, report generation and pre / post processing operations.

The Mosaicing System supports loading of offline video / telemetry data. It also supports capturing data streamed by VFDPS (Video Flight Data Processing System) and using the captured data for subsequent operations. VFDPS is a flight simulator system that streams synchronized video and telemetry data. Video data is streamed over a video cable and telemetry data is streamed over the network. The Mosaicing System is capable of capturing data over these two systems and uses it for subsequent mosaic generation process.

KritiKal’s Solution

Porting Mosaicing System from Visual Studio 6.0 to Qt

An earlier version of the Mosaicing System was developed using Visual Studio 6.0, which is an obsolete technology. Qt, by contrast, is a modern, cross-platform user interface toolkit.

Eliminates perspective effect from free mosaics for better viewing of large mosaics

In order to handle general camera motion, the projection surface must be dynamic. The manifold approach to mosaicing exploits this idea by projecting thin strips from the images onto manifolds that are dynamically determined by the camera motion. While the limitations of existing mosaicing techniques result from using predetermined manifolds, the use of dynamic manifolds overcomes these limitations. With manifold mosaicing it is possible to generate high-quality mosaics even for the very challenging cases of forward motion and of zoom.

Improve blending algorithm to reduce seams in mosaic

When different images are stitched together, adjacent pixel intensities may differ enough to produce artifacts in the generated mosaic. While seams are perceptible in grayscale mosaics, artifacts in color mosaics are significantly more prominent. Blending algorithms remove these artifacts and generate mosaics with fewer visible seams.
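As a simple illustration of the idea, here is a toy Python sketch of feather (cross-fade) blending on one-dimensional strips of pixel intensities. This is just one of many blending schemes and is not presented as the system’s actual algorithm:

```python
def feather_blend(left, right, overlap):
    """Blend two adjacent image strips (1-D rows of intensities) by
    cross-fading over their overlapping region to hide the seam."""
    result = left[:-overlap]
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)          # weight ramps 0 -> 1 across overlap
        a = left[len(left) - overlap + i]
        b = right[i]
        result.append((1 - w) * a + w * b)   # weighted average of both strips
    result.extend(right[overlap:])
    return result

# Two strips with a brightness mismatch: a hard cut would show a sharp
# seam at the join; feathering ramps smoothly from 100 up to 120.
row = feather_blend([100] * 6, [120] * 6, overlap=4)
print([round(v) for v in row])  # [100, 100, 104, 108, 112, 116, 120, 120]
```

The same weighted-average idea extends to 2-D images and to multi-band schemes that blend low and high frequencies separately.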

Moving object segmentation with replay on mosaic

The Mosaicing System automatically determines whether an object in the current view is moving. A moving object is identified, segmented, and tracked across several frames, and its corresponding motion vectors are maintained; these motion vectors are then used to replay the movement of the segmented object on the mosaic.

Intuitive Summarized Replay

The user has the option to mark certain sections of the input video to be displayed as the “summarized video”. Video sequences for which mosaic segments were successfully generated are included in the “summarized video”, and the user can delete sections not deemed useful from it.


Features

  • Load offline video and telemetry data as inputs to the system
  • Accepts an xml configuration file as input. The XML configuration file format will be pre-defined and will contain information about the offline video and telemetry data
  • Accepts uncompressed and MPEG-4 compressed video files as input, along with text files containing telemetry data in a pre-defined format
  • Provides a convenient user interface to view the input video data
  • Displays the telemetry data in numerical format
  • Allows the user to mark interesting sequence(s) in the input video sequence. Mosaic corresponding only to the interesting sequence(s) shall be generated
  • Allows setting of start and end telemetry record indices, in the input telemetry sequence, corresponding to interesting sequence(s) in the input video
  • Gives the option to generate either a planar mosaic or a dynamic manifold mosaic.
  • Supports multiple blending algorithms
  • Supports pre-processing operations like constant brightness enhancement to be applied to input video data before processing it for mosaic generation
  • Generates free mosaics using input video data and geo mosaics using input video and telemetry data
  • Intelligently recovers from a mosaic break in case of large distortions undergone by individual video frames or due to absence of sufficient features
  • Stops extending the current mosaic segment and starts creating a new one, in case of large distortions undergone by individual video frames or due to absence of sufficient features
  • Gives the option of “summarized replay” facility to display outcome of a mosaicing run in a concise manner
  • Stores the movement of tracked object in the form of motion vectors
  • Builds with both GCC (Linux platform) and Visual Studio 2008 (Windows platform)

Image Enhancement & Change Detection Module

Day and Night Surveillance of a Large Area made Efficient

Overview

A leading defence establishment based in north India wanted application software for day and night surveillance of a large area, using a long-range panning camera that could cover ranges as high as a few kilometers. The software had to detect, in real time, any unauthorized movement and raise appropriate alarms, thereby facilitating 24x7 unmanned surveillance of sensitive areas. This posed many challenges, the most important being the poor quality of the input video, the long range, and changing weather conditions. Hence it was mandatory to apply robust image enhancement techniques as a pre-processing step to ensure reliable and efficient detection and tracking of intruders.

KritiKal Solutions designed and developed a PC based Image Enhancement and Change Detection Module (IECDM), which takes input from a live video feed from a continuously panning camera and detects and tracks unauthorized movement in IR, near IR or visual domains.

Challenges

  • Poor quality of input video due to variable and usually harsh weather conditions and long range
  • Detection of slow moving objects
  • Determining the direction of moving objects
  • Acquiring data in a systematic format i.e. useful data logging
  • Identification and classification of target object

KritiKal’s Solution

The PC based software designed and developed by KritiKal Solutions creates mosaics from a live video feed coming from a continuously panning camera and detects changes across different frames of the same scene. It tracks the detected objects, overlaying their accurate geographical locations on the output video. Its built-in image enhancement techniques provide better quality output, enabling reliable and efficient manned or unmanned surveillance of sensitive areas. The software comes with a secure configuration mechanism, making it tamper-proof.
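At its core, change detection between registered frames of the same scene can be as simple as thresholded frame differencing. The toy Python sketch below illustrates the principle only; the deployed module must additionally cope with noise, illumination change, and the camera’s continuous panning:

```python
def detect_changes(prev, curr, threshold=25):
    """Frame differencing: flag pixels whose intensity changed by more
    than `threshold` between two registered frames of the same scene."""
    return [[1 if abs(c - p) > threshold else 0
             for p, c in zip(prow, crow)]
            for prow, crow in zip(prev, curr)]

def has_intruder(mask, min_pixels=3):
    """Raise an alarm only if enough pixels changed (basic noise rejection)."""
    return sum(map(sum, mask)) >= min_pixels

# A flat background frame, then a frame where a small bright object enters.
background = [[10] * 5 for _ in range(5)]
frame = [row[:] for row in background]
for r in range(1, 3 + 1):
    frame[r][2] = 200                    # hypothetical intruder pixels
mask = detect_changes(background, frame)
print(has_intruder(mask))  # True
```

Requiring a minimum number of changed pixels is the simplest form of the noise rejection needed before raising an alarm in an unmanned setting.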

Features

  • Day and night operation with ranges as high as 4-5 km
  • Support for both Thermal and CCD cameras
  • Real-time seamless video mosaicking
  • Change detection, tracking and change trail display on mosaiced video, giving accurate geographical location of the intruding object
  • In-Built Image enhancement techniques to provide better quality output
  • Includes post-processing activities such as:
    • On demand zoomed view of a particular frame of the scene
    • Extensive activity logs for recording history of the intruder movement
  • Configurable options to set up grid references

Value Addition

  • Integrated solution for image mosaicing and change detection, enabling real time surveillance of a large area.
  • Support for Thermal and CCD cameras ensures day and night 24×7 operation.
  • Can be used in a surveillance vehicle or control room scenario
  • An ideal solution for Border Surveillance, Coastal Surveillance, and Homeland Security

Future Enhancements in Pipeline

  • Classification of detected targets based on the profile of their movement
  • Addition of Image fusion algorithm to fuse video inputs from CCD and Thermal cameras, making it possible to use both inputs to their advantage