# The Problem
Sarracenia uses many other packages to provide functionality. These are called *dependencies*. In its native environment (Ubuntu Linux), most of these dependencies are easily resolved using the built-in Debian packaging tools (apt-get), but in many other environments it is more complex, as in https://xkcd.com/1987/. Even in environments where dependencies are installed *somewhere*, it is not always clear which ones are available to a given program.
On redhat-8, for example, there does not seem to be a wide variety of python packages available in the operating system repositories. Only the minimal set of packages that the OS itself needs from python seems to be available. This makes installation on redhat challenging, as one has to package many dependencies in addition to the main package. The typical approach is to hunt for individual dependencies in different third-party repositories, or to rebuild them from source. This is a bit haphazard, and in some cases, such as watchdog or dateparser, the package itself has dependencies, and one ends up having to create dozens of python packages.
On redhat, as in many other environments, it seems more practical to use python native packaging rather than the incomplete OS packages, since pip does dependency resolution and can bring in all the dependencies. The result, if done system-wide, is a mix of distro packages and pip-provided packages, which complicates auditing and patching. System administrators may also object to the use of pip packages in the base operating system.
Windows is another example of an environment where pre-existing package availability is unclear. On Windows, the natural distribution format would be a self-extracting EXE, but it is unclear how plugins would work with such a method, and all the dependencies need to be packaged within it. People also install python *distributions* such as ActiveState, Anaconda, or the more traditional cpython, and each has its own installation method.
The complications mostly arise from dependencies such as xattr, python3-magic, and watchdog, that is, packages that wrap C libraries or use C libraries as part of their implementation. In these cases, pure python packaging often fails, as more environmental support is needed. For example, the python-magic python package requires the C library libmagic1 to be installed. With OS packages, this is just an additional dependency and no problem, but with pip it will just fail, and the user needs to find the OS package, install that, and then try installing the python package again.
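The libmagic case can be probed before anyone tries to install anything. The following is a minimal sketch, not code from Sarracenia: the function name `diagnose_magic` and its messages are hypothetical, and it assumes the C library is discoverable under the name `magic` by `ctypes.util.find_library`, which depends on the platform.

```python
import ctypes.util
import importlib.util


def diagnose_magic():
    """Return a short message describing what is missing for python-magic."""
    # The python module is called 'magic', not 'python-magic'.
    has_module = importlib.util.find_spec("magic") is not None
    # find_library returns a library name or path, or None when nothing is found.
    has_clib = ctypes.util.find_library("magic") is not None
    if has_module and has_clib:
        return "ok"
    if not has_clib:
        return "C library libmagic not found: install the OS package first"
    return "libmagic is present, but the python module 'magic' is not installed"


print(diagnose_magic())
```

A check like this lets the message point at the OS package rather than leaving the user with a bare failure.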
Another complication is that all these platforms have different methods of installation, so it is not obvious what advice to give users when a dependency is missing: "pip install? conda install? apt install? yum install?" The package naming conventions vary by distribution, and are different from the module names used to test for their presence.
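One way to picture the naming problem is a lookup table from module name to package name per installer. This is only an illustrative sketch: the table `PACKAGE_NAMES`, the function `install_advice`, and the package names listed are assumptions that would need to be verified on each platform.

```python
# Hypothetical mapping: module name -> package name for each installer.
# Entries are illustrative and must be checked against the real repositories.
PACKAGE_NAMES = {
    "magic": {"pip": "python-magic", "apt": "python3-magic"},
    "paho.mqtt.client": {"pip": "paho-mqtt", "apt": "python3-paho-mqtt"},
    "watchdog": {"pip": "watchdog", "apt": "python3-watchdog"},
}

INSTALL_COMMANDS = {"pip": "pip install {}", "apt": "apt install {}"}


def install_advice(module, installer):
    """Suggest an install command for a missing module, if a package is known."""
    package = PACKAGE_NAMES.get(module, {}).get(installer)
    if package is None:
        return f"no known {installer} package provides module {module!r}"
    return INSTALL_COMMANDS[installer].format(package)


print(install_advice("magic", "apt"))
print(install_advice("magic", "yum"))
```

The table makes the maintenance cost visible: every module needs an entry per installer, and gaps (such as yum here) have to be reported honestly.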
## Approaches to Dependency Management
### Manual Tailoring
For HPC (which runs redhat 8.x), a few dependencies are brought in by EPEL packages and some are built from source, but some had to be left out. When building packages on redhat, the setup.py file is typically hand edited to work around packages that are not available. After the RPM is generated, it is tested on another system, as a different user, to see whether it runs (the local user doing the build may have pip packages that provide dependencies not available to others.)
Implementation: manual editing of setup.py to remove dependencies.
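Part of the "different user" test can be approximated on the build machine itself. The sketch below is an assumption about how one might do it, not an existing Sarracenia tool: the hypothetical helper `importable_without_user_site` starts a child interpreter with `-I` (isolated mode), which ignores the per-user site-packages directory, so modules that only exist because of `pip install --user` are not found.

```python
import subprocess
import sys


def importable_without_user_site(module):
    """Try importing `module` in a child interpreter started in isolated mode."""
    # -I ignores PYTHON* environment variables and the per-user site-packages
    # directory, so packages installed with `pip install --user` are not seen.
    result = subprocess.run(
        [sys.executable, "-I", "-c", f"import {module}"],
        capture_output=True,
    )
    return result.returncode == 0


for name in ["json", "watchdog"]:
    print(name, importable_without_user_site(name))
```

This does not replace testing on a clean system, since system-wide pip packages and virtual environments are still visible to the child interpreter.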
### (Mostly) Silent Disable
Looking at xattr, the *import* is in a try/except, and if it fails, the storing of metadata in extended file attributes is disabled. As a result, there is a loss of functionality or a different behaviour on these systems. There is no way to query the system for which *degrades* are active, and nothing tells the user what to do to address them, if they want to.
Implementation in filemetadata.py:
```
# Extended attribute support is optional: fall back quietly if xattr is absent.
try:
    import xattr
    supports_extended_attributes = True
except ImportError:
    supports_extended_attributes = False
```
There are also tests in `sarracenia/__init__.py` that let the code detect missing dependencies and degrade accordingly:
```
import importlib.util
import logging

logger = logging.getLogger(__name__)

# Optional features, the modules each one needs, and what is lost without them.
extras = {
    'amqp': {'modules_needed': ['amqp'], 'present': False,
             'lament': 'will not be able to connect to rabbitmq brokers'},
    'appdirs': {'modules_needed': ['appdirs'], 'present': False,
                'lament': 'will assume linux file placement under home dir'},
    'ftppoll': {'modules_needed': ['dateparser', 'pytz'], 'present': False,
                'lament': 'will not be able to poll with ftp'},
    'humanize': {'modules_needed': ['humanize'], 'present': False,
                 'lament': 'humans will have to read larger, uglier numbers'},
    'mqtt': {'modules_needed': ['paho.mqtt.client'], 'present': False,
             'lament': 'will not be able to connect to mqtt brokers'},
    'filetypes': {'modules_needed': ['magic'], 'present': False,
                  'lament': 'will not be able to set content headers'},
    'vip': {'modules_needed': ['netifaces'], 'present': False,
            'lament': 'will not be able to use the vip option for high availability clustering'},
    'watch': {'modules_needed': ['watchdog'], 'present': False,
              'lament': 'cannot watch directories'}
}

# A feature is present only if every module it needs can be found.
for x in extras:
    extras[x]['present'] = True
    for y in extras[x]['modules_needed']:
        try:
            if importlib.util.find_spec(y):
                #logger.debug( f'found feature {y}, enabled')
                pass
            else:
                logger.debug(f"extra feature {x} needs missing module {y}. Disabled")
                extras[x]['present'] = False
        except Exception:
            # find_spec raises when the parent package of a dotted name is missing.
            logger.debug(f"extra feature {x} needs missing module {y}. Disabled")
            extras[x]['present'] = False
```
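Once a dictionary like this exists, turning it into something a user can read is a small step. The following is a minimal sketch under the assumption that entries keep the `present` and `lament` keys shown above; the function name `extras_report` and the sample data are hypothetical.

```python
def extras_report(extras):
    """Return one line per feature: its name, its status, and the lament if absent."""
    lines = []
    for name in sorted(extras):
        entry = extras[name]
        if entry["present"]:
            lines.append(f"{name}: present")
        else:
            lines.append(f"{name}: absent, {entry['lament']}")
    return lines


# Small hand-made sample in the same shape as the dictionary above.
sample = {
    "watch": {"modules_needed": ["watchdog"], "present": True,
              "lament": "cannot watch directories"},
    "vip": {"modules_needed": ["netifaces"], "present": False,
            "lament": "will not be able to use the vip option"},
}

for line in extras_report(sample):
    print(line)
```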
### Demotion to Extras
Python packaging has a concept of *extras*, sort of the inverse of *batteries included*. In setup.py one can declare optional features that become available when additional dependencies are installed:
```
import itertools

# Each extra names the additional distributions it pulls in.
extras = {
    'amqp': ["amqp"],
    'filetypes': ["python-magic"],
    'ftppoll': ['dateparser'],
    'mqtt': ['paho.mqtt>=1.5.1'],
    'vip': ['netifaces'],
    'redis': ['redis']
}
# 'all' is the union of every other extra.
extras['all'] = list(itertools.chain.from_iterable(extras.values()))
```
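Extras are expressed in terms of distribution names (such as `python-magic`), not the module names used at import time (such as `magic`). As a rough sketch, one could check which distributions of an extra are installed with the standard `importlib.metadata` module. The helper `missing_distributions` is hypothetical, and it assumes plain names with no version specifiers, so an entry like `paho.mqtt>=1.5.1` would need its specifier stripped first.

```python
from importlib import metadata


def missing_distributions(requirements):
    """Return the distribution names from `requirements` that are not installed."""
    missing = []
    for name in requirements:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


# Plain distribution names only; version specifiers are not handled here.
print(missing_distributions(["python-magic", "netifaces"]))
```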
### Platform Dependent Deps
One can also add dependencies that vary depending on the platform being installed on:
```
    install_requires=[
        "appdirs", "humanfriendly", "humanize", "jsonpickle", "paramiko",
        "psutil>=5.3.0", "watchdog",
        'xattr ; sys_platform!="win32"', 'python-magic; sys_platform!="win32"',
        'python-magic-bin; sys_platform=="win32"'
    ],
```
(This is in the [v03_issue721_platdep](https://github.com/MetPX/sarracenia/tree/v03_issue721_platdep) branch.)
## What do we do?
All of the approaches above (and perhaps others?) are used in the code, so someone using an installation will have a subset of functionality available, and sr3 has no way of reporting what is available or not. There is a branch (https://github.com/MetPX/sarracenia/pull/738) that provides an example report of the modules available using an *sr3 extras* command.
Should we at least report what is working and what isn't? An additional problem is that configured plugins may have additional dependencies. The mechanism in the pull request also provides a way for plugins to *register* those, so they show up in the inventory command.
Is this a reasonable/advisable approach?
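To make the registration idea concrete for discussion, here is a minimal sketch of what a plugin-facing call might look like. It is not the code in the pull request: the names `features` and `register_feature` are hypothetical, and the entry layout simply mirrors the extras dictionary shown earlier.

```python
import importlib.util

# Hypothetical registry, in the same shape as the extras dictionary.
features = {}


def register_feature(name, modules_needed, lament):
    """Record a feature and probe whether all of its modules can be found."""
    present = True
    for module in modules_needed:
        try:
            if importlib.util.find_spec(module) is None:
                present = False
        except (ImportError, ValueError):
            # find_spec raises when the parent package of a dotted name is missing.
            present = False
    features[name] = {
        "modules_needed": list(modules_needed),
        "present": present,
        "lament": lament,
    }
    return present


# A plugin would call this at import time to declare its own needs.
register_feature("my_plugin", ["json"], "my_plugin will not be able to parse input")
print(features["my_plugin"]["present"])
```

With something like this, a plugin's dependencies appear in the same inventory as the built-in ones, so a single report covers both.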