
One of the challenges when learning astronomy is identifying stars, constellations and planets in the sky. This is where a planetarium application steps in to help. With the gyroscopes and compasses built into all modern smartphones, the device knows which direction it is pointing in as you move around. This allows you to hold the phone up to the night sky and have the planetarium application display a chart of what you are pointing at. There are several applications available which offer this capability, with Google Skymap being a no-frills, but effective, free option. If you are willing to pay for a planetarium app, then the well-known Stellarium desktop app is also available as a paid-for option.

Sunrise Sunset (Free with ads)

It can’t escape anyone’s attention that the Sun and Moon rise and set at a different time every day. The same is true for all the planets in the solar system that astronomers might want to view. It is thus useful to have a quick reference for the rise and set times of all the main bodies in the solar system. This app provides details on the different stages of sunrise/sunset, and the rise and set times of each planet, in a quick to access manner. A slight annoyance is that the app sometimes gets stuck and displays a blank page.

Eclipse 2.0 (Free with ads)

Eclipses, whether solar or lunar, are relatively rare events. A total solar eclipse happens roughly every 18 months somewhere on the planet, but any given spot may have to wait decades between events. Lunar eclipses are a bit more frequent, but it might be a year or more between total lunar eclipses. This application will display a list of all future events (for the next 100 years), and can optionally filter them based on the viewing location.

Solar Eclipse Timer (Paid per eclipse event)

When travelling to view or photograph a total solar eclipse, it is critical to know exact timings for the start and end of the eclipse, as well as the start and end of totality. These times vary depending on your precise physical location, so an app is needed to provide localized timings. You don’t want to be fiddling with the phone at these times, so this app provides useful spoken audio alerts for each phase of the eclipse, allowing you to focus on more important things.

Clear Outside (Free)

For visual observers and astrophotographers, many an observing session will be ruined by poor weather conditions. This is particularly frustrating when certain events (eclipses, planetary conjunctions, comets) are only visible at very specific days and times. It is thus vital to both understand the current weather conditions and get predictions for forthcoming nights. There are a great many weather applications available, but general purpose forecasts don’t usually provide the level of detail desired by astronomers. This application is designed specifically for astronomers, showing many fine details not otherwise available. It includes percentage cloud cover at three levels in the atmosphere, rain predictions, general visibility, wind speeds and more. It also details lunar phases, since illumination from the moon can be as bad as city light pollution in drowning out fainter targets.

GPS Status (Free)

When calibrating / aligning telescope mounts it is often useful to know the observing location’s longitude and latitude, as well as which direction is north. Phones all have built-in magnetometers and gyroscopes which can provide direction information, and GPS receivers to provide the location. The GPS Status application presents this data in an easily accessible format and also provides the ability to recalibrate the internal compass, which is worth doing periodically as some phones quite easily lose calibration.

ISS Detector (Free with ads, or paid without ads)

An event that always gets attention from attendees, particularly from first timers, at BSIA events is a flyover of the International Space Station. The ISS makes an orbit of the Earth approximately every 92 minutes, but it will follow a different track each time affecting where it appears in the sky, when viewed from a fixed point. The time relative to sunrise or sunset also affects how long it is visible for and its brightness magnitude. The “ISS Detector” application crunches the numbers and presents information on when passes are visible from the current observing location for the next 7 days or so. It will illustrate the path it will take across the sky with start/end times and magnitudes. It can be made to trigger a phone alert shortly before the flyover takes place to minimize the chance of accidentally forgetting when it happens. Despite the name it can provide track info for other objects besides the ISS.

ISS Transit (Free)

Some of the more interesting events for astrophotographers focusing on solar system objects are conjunctions and occultations. Flying in low earth orbit, the ISS has quite frequent alignments with other solar system objects, especially the moon and sun. These events are interesting to photograph, allowing the ISS to be seen in silhouette against the Sun, or brightly illuminated against the Moon. Whether these events are visible is highly dependent on the exact viewing position on the ground, and thus they require advance planning and travel to quite specific sites. This application is able to perform calculations to show where forthcoming interesting transits are taking place.

Heavens Above (Free with ads)

The earlier mentioned ISS Detector application focuses on flyovers of the ISS. There are many more satellites in orbit around the Earth that, while often not visible to the naked eye, can appear in photographs either intentionally or unintentionally. This application provides information on all the satellites that are likely to be overhead for any given night.

Dark Sky Map (Free)

With an ever increasing portion of the population living in towns and cities, light pollution is a big problem to deal with. Even though the switch to LED lighting has the potential to reduce light pollution by having strictly downwards facing illumination, in practice there’s been little obvious benefit as the lights are often brighter. Whether deciding on where is best to live, or planning day excursions or extended holidays, it is useful to understand the extent of light pollution across the country / world. The dark sky map provides this information in an easily understandable format, overlaying a map with colour coded bands for light pollution.

Aurora Watch (Free, no ads)

Aurora, often known as “Northern Lights”, form when energetic particles from the Sun interact with the Earth’s atmosphere and magnetic field. As the name suggests, they occur more frequently the further north you go (or further south in the Southern hemisphere). There is a high correlation between the flux of particles received from the Sun and the visibility of Aurora. The application provides aurora visibility forecasts for the days ahead and active alerts when high thresholds are exceeded.

Polaris View (Free, no ads)

When setting up an equatorial tracking mount, care needs to be taken to accurately align the mount with the North Star, Polaris. It is not as simple as just putting Polaris in the dead center, as it is not quite at true north in the sky. Instead it needs to be offset by a small amount. The direction in which it is offset depends on the time of day / night, and the amount of the offset varies from year to year as the Earth wobbles on its axis. Mounts will typically have a polar scope / sight to facilitate this alignment process, but using it may require knowledge of the direction in which Polaris should be offset. The application provides a simple view of the sky simulating what it should look like through a polar scope. The astronomer simply needs to align their scope so that the position of Polaris matches that shown in the app.

Night Shift (Free, no ads)

This application is a new and promising discovery, aiming to provide a summary of everything that is interesting in the night sky for any given night. It displays visibility of all the solar system planets and the Moon. The viewer can enter information about their telescopes and viewing location, and it can then highlight which other constellations, nebulae, and deep sky objects are likely to be possible to view. It effectively combines functions of many of the previously discussed applications into one convenient dashboard.

Sun Position Map (Free, no ads)

When planning photographic shots featuring the sun or moon in combination with foreground objects, it can be useful to understand exactly where the sun will be in relation to the foreground. This application provides an overlay for Google Maps, showing the path of the sun from rise to set. Just find the object to be photographed on the map, and the app will show where the sun will be in relation to it. It can also overlay a field of view indicator for various camera lens focal lengths.

Digitizing books with a camera and open source software


Last year I needed to help out with digitizing an old book, so that its current content can be updated, expanded and ultimately reprinted. The current copy was written directly on a typewriter and ran to almost 250 pages of very dense text. Transcribing that much content manually was not an enticing prospect, so I started investigating the options for automation. I quickly found the open source Tesseract OCR software, which runs on Linux, Windows and OS X. It dates back to the mid-1980s and was open sourced by HP in 2005. Tesseract focuses on just the core OCR tasks, and leaves image acquisition to other tools; likewise post-recognition processing.

Reading about how it works, it becomes clear that the biggest factor in accuracy is the quality of the input images. It converts the input image to monochrome, approximately speaking, by applying a threshold algorithm to the image. For this to work effectively the image has to be evenly illuminated; any kind of gradient across the background will confound the monochrome conversion, leading to large blocks of text getting lost.
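To make the effect of uneven lighting concrete, here is a minimal sketch of the simplest possible threshold: a single fixed cutoff applied to every pixel. This is an illustration only and is not Tesseract's actual binarization code; the function names, the cutoff of 128 and the synthetic one-row "page" are all assumptions made up for the example.

```python
def global_threshold(pixels, cutoff=128):
    """Map each grey value (0-255) to 1 (paper) or 0 (ink) using one fixed cutoff."""
    return [[1 if value >= cutoff else 0 for value in row] for row in pixels]


def make_page(width, left_bg, right_bg, ink_drop=120):
    """Build a synthetic one-row page (hypothetical test data).

    The paper brightness fades linearly from left_bg to right_bg, and every
    fourth pixel is 'ink', which is ink_drop darker than the paper under it.
    """
    row = []
    for x in range(width):
        background = left_bg + (right_bg - left_bg) * x / (width - 1)
        row.append(background - ink_drop if x % 4 == 0 else background)
    return [row]


def count_ink(binary):
    """Count the pixels that the threshold classified as ink."""
    return sum(row.count(0) for row in binary)


evenly_lit = make_page(16, left_bg=220, right_bg=220)
shaded = make_page(16, left_bg=220, right_bg=100)

print("ink pixels, even lighting:  ", count_ink(global_threshold(evenly_lit)))
print("ink pixels, shaded lighting:", count_ink(global_threshold(shaded)))
```

Both rows contain the same four ink pixels. With even lighting exactly those four come out black, but on the shaded row the darker end of the paper also falls below the cutoff and is swallowed along with any text printed on it.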

A flatbed scanner is not a satisfactory way of capturing the pages of the book because it is impossible to get the pages flat without damaging the spine. Instead a digital SLR camera is the preferred tool, with flashgun(s) to provide illumination. Even when using a camera, the spine of the book is still a problem: if you simply open the book on a flat surface, the pages will curve near the spine, leading to uneven illumination and distortion of the text, which will ruin OCR accuracy. The solution to this is to construct some kind of book scanning rig that will support the book such that it opens to an angle somewhere in the region of 110-140 degrees. This might sound like overkill for a single book, but it is well worth the effort.

The simple book scanning rig, constructed from MDF, large enough to hold a book up to approximately 20x30cm in size.

The rig is very simple to construct, requiring little more than a sheet of MDF, wood screws, a jigsaw, drill and screwdriver. The exact dimensions are not important; this one was sized to fit the book that was being digitized, using off-cuts of MDF left over from a previous job. Two rectangular pieces of MDF were used to form the long sides of the rig, and a v-shape was cut out from each to form an angle of approximately 120 degrees. Two more pieces of MDF were cut to form the short ends of the rig. The four pieces were screwed together to give the basic box form. A final two pieces of MDF were cut to cover over the v-shaped depression, and screwed to the sides. The screws were all countersunk and covered with wood filler. The final task was giving the thing a coat of white paint. The construction didn’t take more than an hour, plus a few hours between two coats of paint. It was allowed to fully dry overnight.

The v-shaped depression cradles the spine of the book, but the pages are still a little curved. The trick to dealing with this is to place a sheet of glass on top of the page. The glass needs to be thick enough to have sufficient weight to hold the pages completely flat. A salvaged window pane in the workshop happened to be the right size and suitable weight. If buying glass new it is preferable to have the edges rounded off smooth. The salvaged glass had rough edges so some red electrical tape was stuck over the edges to protect the book. Bear in mind that the tape must not cover any text that needs scanning – generally there are sufficient margins in books that this won’t be a problem.

Book on the scanning rig with glass weight holding the page flat

With construction of the rig to support the book complete, it was time to begin the image capture process. To avoid distortion of the text, the camera needs to be perfectly aligned such that the lens axis is perpendicular to the page. Needless to say, the camera needs to be placed on a tripod and positioned over the pages of the book. The parallel pencil lines drawn on the side of the book scanner rig are there to assist in aligning the camera. The OCR process will work best with images that have a resolution of at least 300 DPI. To get near this kind of resolution, the camera lens needs to be chosen to ensure the page will fill the image. Zoom lenses in particular suffer from geometric distortion at the extremes of their focal length and aperture ranges. Picking a middling aperture in the region of f/8 will minimize the distortion and thus improve OCR accuracy.
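As a rough guide to what 300 DPI means in camera terms, the sketch below converts a page size into the pixel dimensions the page must occupy in the frame. The 20x30cm page is taken from the maximum size the rig holds, not the actual book, and the 4000 pixel sensor dimension is a made-up figure for illustration; the function names are likewise hypothetical.

```python
CM_PER_INCH = 2.54


def pixels_needed(width_cm, height_cm, dpi=300):
    """Pixel dimensions a page must cover in the image to reach the given DPI."""
    width_px = round(width_cm / CM_PER_INCH * dpi)
    height_px = round(height_cm / CM_PER_INCH * dpi)
    return width_px, height_px


def effective_dpi(image_px, page_cm, fill=1.0):
    """DPI achieved when the page spans the fraction `fill` of `image_px` pixels."""
    return image_px * fill / (page_cm / CM_PER_INCH)


width_px, height_px = pixels_needed(20, 30)
print(f"20x30cm at 300 DPI needs {width_px} x {height_px} pixels "
      f"({width_px * height_px / 1e6:.1f} megapixels on the page itself)")

# Hypothetical camera: 4000 pixels along the edge aligned with the 30cm side.
print(f"Filling the frame:        {effective_dpi(4000, 30):.0f} DPI")
print(f"Page fills 80% of frame:  {effective_dpi(4000, 30, fill=0.8):.0f} DPI")
```

The second pair of figures shows why filling the frame matters: the same hypothetical camera drops below 300 DPI once the page occupies only 80% of the image.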

To avoid a gradient across the background of the page the lighting setup during image capture is important. It is unlikely that either normal room lighting or natural lighting from windows will give even illumination. It is better to take full control by using camera flash guns. Ideally a pair of flash guns would be positioned either side of the book, their combined beams giving the desired result. I only had a single flash gun available, but the room was blessed with large white walls and ceilings. The flash gun was thus pointed at an angle towards the wall on a wide spread. This required a high power level on the flash, but the resulting reflected light gave excellent results.

It is possible to do everything in camera, but since I wrote the Entangle Photo software for controlling digital SLRs, I naturally used a laptop to control the camera. This allowed the images to be reviewed on a large display on the fly. It was now simply a matter of running through the book, turning the page and pressing the shutter button ~250 times. It was quicker to capture all the odd pages first, and then flip the book around and capture all the even pages. If capturing lots of books a second camera would be desirable allowing odd and even pages to be captured at the same time.

Camera mounted on tripod above the book scanning rig. Flash gun is pointed at a white wall for reflected light. The laptop controls the camera and displays captured images.

With this rig in operation it was possible to easily capture about 6-8 pages a minute, allowing the whole book to be captured in less than 40 minutes. Due to the positioning of the camera rig, the pages were all perfectly square with respect to the image, but it is still necessary to crop the images to eliminate borders which can confuse the OCR process. The Darktable application makes it easy to process large numbers of raw images, cropping them all to the same extents.

The cropped images can now be run through the Tesseract command line program to perform the OCR process. The results, for the most part, were really very impressive in terms of accuracy. Where it had problems in particular was with some pages typed on very thin paper, such that text from the reverse would bleed through. Those pages had to be thrown away and transcribed by hand, but this was only 6 pages out of 250. Tesseract uses a language dictionary to analyse the recognised text and resolve ambiguity. This can lead to similar looking words being substituted, and it notably falls down on place and person names, which are largely absent from the dictionary.
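Running the command line program over a whole directory of cropped pages is easy to script. The sketch below is one way it might be done, not a record of the commands actually used for this book: the PNG file pattern, the directory layout and the "eng" language are assumptions. It relies only on the basic `tesseract IMAGE OUTPUTBASE -l LANG` form of the command, where Tesseract appends `.txt` to the output base itself.

```python
import subprocess
from pathlib import Path


def tesseract_command(image, lang="eng"):
    """Build the argument list for OCRing one image into a sibling .txt file."""
    image = Path(image)
    output_base = image.with_suffix("")  # tesseract adds the .txt extension
    return ["tesseract", str(image), str(output_base), "-l", lang]


def ocr_directory(directory, pattern="*.png", lang="eng"):
    """Run tesseract on every matching image, one text file per page."""
    for image in sorted(Path(directory).glob(pattern)):
        # check=True stops the batch if tesseract reports an error
        subprocess.run(tesseract_command(image, lang), check=True)


if __name__ == "__main__":
    ocr_directory("pages")  # hypothetical directory of cropped page images
```

Keeping one text file per image makes the later proofreading step simpler, since each file can be compared against exactly one page.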

The results of OCR on each image were saved to a separate plain text file. These were loaded up in gedit and the input images loaded up in gthumb. Their respective windows were tiled such that they filled the screen side-by-side. gedit annoyingly doesn’t have a way to turn on the spell checker by default, but a one line change in the source code and a rebuild was sufficient to fix this. Each text file had to be read through and the highlighted spelling mistakes all corrected. This was by far the most time consuming part given the large amount of text per page, taking 1-3 minutes per page depending on the quantity of corrections needed, adding up to many hours of work in total across the full 250 pages.
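To put a number on "many hours", the per-page figures above can simply be multiplied out. This is back-of-the-envelope arithmetic using only the 250 pages and 1-3 minutes per page quoted in the text; the function name is invented for the example.

```python
def total_hours(pages, minutes_per_page):
    """Total proofreading time in hours for a given per-page rate."""
    return pages * minutes_per_page / 60


best_case = total_hours(250, 1)
worst_case = total_hours(250, 3)
print(f"Proofreading 250 pages: {best_case:.1f} to {worst_case:.1f} hours")
```

That is somewhere between roughly four and twelve and a half hours of proofreading, compared with about 40 minutes for the image capture itself.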

The OCR process aims to preserve the page layout, so it will put hard line breaks after every line of text. The new digital edition of the book is expected to be printed at a different size than the original, so the layout was inevitably going to change. Thus a second pass through every page was made to remove the hard line breaks, leaving just the paragraph breaks.
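This unwrapping pass can be automated where the OCR output separates paragraphs with a blank line. The sketch below assumes exactly that, and it is not necessarily how the pass was done for this book; it also makes no attempt to rejoin words that were hyphenated across a line break, which would still need fixing by hand. The function name is hypothetical.

```python
import re


def unwrap_paragraphs(text):
    """Join the lines inside each paragraph, keeping blank-line paragraph breaks."""
    # One or more blank (or whitespace-only) lines mark a paragraph boundary.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    # split() + join() collapses each hard line break into a single space.
    return "\n\n".join(" ".join(paragraph.split()) for paragraph in paragraphs)


page = (
    "The quick brown fox\n"
    "jumps over the lazy dog.\n"
    "\n"
    "A second paragraph that\n"
    "was also hard wrapped.\n"
)
print(unwrap_paragraphs(page))
```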

Each plain text file still represents a single page, so they are now concatenated with a form feed character inserted between each file’s contents. The digitizing of the book is essentially now complete and the result ready to be loaded into LibreOffice for the interesting editing work to start. It is hard to accurately say how long the whole digitization process took, since the effort was spread, sporadically, a little at a time over a large number of weeks. It was, however, definitely faster and less tedious than transcribing the entire text of the book manually. With good quality input images, and the right language dictionary, the accuracy of Tesseract OCR is very impressive and well worth using.
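The concatenation step could be scripted along the following lines. This is a sketch under assumptions: the directory name, the `*.txt` pattern and the output file name are invented, and the page files are assumed to sort into reading order by name (for example with zero-padded page numbers).

```python
from pathlib import Path

FORM_FEED = "\f"


def join_pages(page_texts):
    """Join page texts with a form feed between consecutive pages."""
    # Strip stray newlines and any form feed already present at the page
    # edges, so each page ends in one newline and has one separator.
    return FORM_FEED.join(text.strip("\f\n") + "\n" for text in page_texts)


def concatenate_files(directory, output, pattern="*.txt"):
    """Write all matching page files into one output file; return the page count."""
    files = sorted(Path(directory).glob(pattern))
    pages = [path.read_text(encoding="utf-8") for path in files]
    Path(output).write_text(join_pages(pages), encoding="utf-8")
    return len(files)


if __name__ == "__main__":
    # Write the result outside the page directory so that a second run
    # does not pick up the combined file as if it were a page.
    print(concatenate_files("pages", "book.txt"), "pages joined")
```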

f/138 – Daniel Berrangé
