public inbox for cygwin@cygwin.com
From: Jim Garrison <jhg@jhmg.net>
To: cygwin@cygwin.com
Subject: Re: Trying to build OCRmyPDF under Cygwin, hit a brick wall
Date: Thu, 14 May 2020 15:50:44 -0700
Message-ID: <2afedbcf-f3d7-4e96-e196-fb3091630245@jhmg.net>
In-Reply-To: <a5987f7c-1448-ce81-d40b-e3034e406eb8@jhmg.net>

The magic incantation necessary to get strdup turns out to be
-D_GNU_SOURCE, as noted on StackOverflow.

However, now I'm encountering a problem with Python's DLL handling
code.  When attempting to run OCRmyPDF I get



$ ocrmypdf --help
Traceback (most recent call last):
  File "/usr/bin/ocrmypdf", line 11, in <module>
    load_entry_point('ocrmypdf==9.8.0.post3+g5944044.d20200514', 'console_scripts', 'ocrmypdf')()
  File "/usr/lib/python3.7/site-packages/pkg_resources/__init__.py", line 489, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "/usr/lib/python3.7/site-packages/pkg_resources/__init__.py", line 2852, in load_entry_point
    return ep.load()
  File "/usr/lib/python3.7/site-packages/pkg_resources/__init__.py", line 2443, in load
    return self.resolve()
  File "/usr/lib/python3.7/site-packages/pkg_resources/__init__.py", line 2449, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
  File "/usr/lib/python3.7/site-packages/ocrmypdf-9.8.0.post3+g5944044.d20200514-py3.7.egg/ocrmypdf/__init__.py", line 18, in <module>
    from . import helpers, hocrtransform, leptonica, pdfa, pdfinfo
  File "/usr/lib/python3.7/site-packages/ocrmypdf-9.8.0.post3+g5944044.d20200514-py3.7.egg/ocrmypdf/leptonica.py", line 67, in <module>
    """
ocrmypdf.exceptions.MissingDependencyError:

---------------------------------------------------------------------
        This error normally occurs when ocrmypdf can't find the Leptonica
        library, which is usually installed with Tesseract OCR. It could be that
        Tesseract is not installed properly, we can't find the installation
        on your system PATH environment variable.

        The library we are looking for is usually called:
            liblept-5.dll   (Windows)
            liblept*.dylib  (macOS)
            liblept*.so     (Linux/BSD)

        Please review our installation procedures to find a solution:
            https://ocrmypdf.readthedocs.io/en/latest/installation.html

---------------------------------------------------------------------


In the last file of the traceback (leptonica.py) there's this:


from ctypes.util import find_library
...
if os.name == 'nt':
    libname = 'liblept-5'
    os.environ['PATH'] = shim_paths_with_program_files()
else:
    libname = 'lept'


In Cygwin, that library is /usr/bin/cyglept-5.dll (why was the name
changed?)

First I created a symlink from cyglept-5.dll to liblept-5.dll, with no
effect. So I added a test for Cygwin at that point, resulting in this
code:


if os.name == 'nt':
    libname = 'liblept-5'
    os.environ['PATH'] = shim_paths_with_program_files()
elif sys.platform == 'cygwin':
    libname = 'cyglept-5'
else:
    libname = 'lept'
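
The same selection can be written as a small pure function, which makes each
branch checkable without being on that platform. This is only a sketch:
lept_library_name is a hypothetical helper, not something in OCRmyPDF, and it
leaves out the PATH shim that the real 'nt' branch performs.

```python
import os
import sys


def lept_library_name(os_name=os.name, platform=sys.platform):
    """Return the Leptonica library name to search for (hypothetical helper).

    Cygwin reports os.name == 'posix', so it has to be detected through
    sys.platform rather than os.name.
    """
    if os_name == 'nt':
        return 'liblept-5'   # native Windows build
    if platform == 'cygwin':
        return 'cyglept-5'   # Cygwin DLLs use the 'cyg' prefix
    return 'lept'            # Linux/BSD/macOS: find_library adds the 'lib' prefix
```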


This also had no effect, so I tried playing with find_library() in the
interactive shell.  In Cygwin, it doesn't seem to find any DLLs even
though those DLLs are actually loadable.  Viz:


$ python3
Python 3.7.7 (default, Apr 10 2020, 07:59:19)
[GCC 9.3.0] on cygwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> import sys
>>> from ctypes import *
>>> from ctypes.util import find_library
>>> find_library('cyglept-5') or 'Not found'
'Not found'
>>> find_library('cyglept-5.dll') or 'Not Found'
'Not Found'
>>> cdll.LoadLibrary('cyglept-5.dll') or 'Not Found'
<CDLL 'cyglept-5.dll', handle 3f7970000 at 0x6fffffea76d0>


So it appears that find_library() may be broken on Cygwin: it does not
find the library, yet Python can load that same library by file name.

What am I missing?
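
Given the session above, one possible workaround is to stop treating a failed
find_library() as fatal and to try loading by file name instead. The following
is a minimal sketch under that assumption; load_library and its fallbacks
parameter are hypothetical names, not part of ctypes or OCRmyPDF.

```python
from ctypes import CDLL
from ctypes.util import find_library


def load_library(name, fallbacks=()):
    """Load a shared library, falling back to explicit file names.

    find_library() is tried first; if it returns None (as it did for
    'cyglept-5' in the session above), each name in `fallbacks` is handed
    directly to CDLL, which lets the system loader search for it.
    """
    resolved = find_library(name)
    candidates = ([resolved] if resolved else []) + list(fallbacks)
    tried = []
    for candidate in candidates:
        try:
            return CDLL(candidate)
        except OSError:
            tried.append(candidate)  # remember what failed for the error message
    raise OSError(f"could not load {name!r}; tried {tried}")
```

On Cygwin this might be called as load_library('cyglept-5', ['cyglept-5.dll']),
which would reach the same CDLL load that succeeded interactively.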


-- 
Jim Garrison jhg@acm.org

