How to filter the latest version of every file in a tree using Ansible?

Submitted by 时光毁灭记忆、已成空白 on 2021-02-10 05:58:25

Question


I have a mid-sized directory tree with a variety of files:

  • /some/place/distfiles/foo-1.2.jar
  • /some/place/distfiles/subdir/foo-1.3.jar
  • /some/place/distfiles/bar-1.1.jar
  • /some/place/distfiles/bar-1.1.2.jar

I use the find module to get the full list, but I only need the latest version of each of foo and bar. The above set, for example, needs to be reduced to:

  • /some/place/distfiles/subdir/foo-1.3.jar
  • /some/place/distfiles/bar-1.1.2.jar

No, I cannot rely on the files' timestamps -- only on the numerical parts of the filenames...

Can someone suggest an elegant way of doing it?


Answer 1:


This will take a non-trivial amount of Jinja2, so you may be happier writing a custom module and working in an actual programming language. You can still use find: and register: to capture the list of filenames, then hand that list to your custom module to roll up the "latest" of each file according to your rules.
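To give a rough idea of what such a custom module or filter would have to do, here is a minimal Python sketch of the roll-up. The function name latest_per_name and the regex are illustrative only, not part of any Ansible API, and the sketch assumes that every version is a run of dot-separated integers and every file ends in .jar.

```python
import os
import re

# Assumed filename layout: <name>-<dotted numeric version>.jar
NAME_VERSION = re.compile(r"^(?P<name>.+)-(?P<version>\d+(?:\.\d+)*)\.jar$")


def latest_per_name(paths):
    """Return {name: path}, keeping only the highest version of each name."""
    best = {}  # name -> (version tuple, path)
    for path in paths:
        match = NAME_VERSION.match(os.path.basename(path))
        if match is None:
            continue  # skip files that do not follow the assumed layout
        # Compare versions numerically, component by component.
        version = tuple(int(part) for part in match.group("version").split("."))
        name = match.group("name")
        if name not in best or version > best[name][0]:
            best[name] = (version, path)
    return {name: path for name, (_version, path) in best.items()}


# The paths below are the ones from the question.
paths = [
    "/some/place/distfiles/foo-1.2.jar",
    "/some/place/distfiles/subdir/foo-1.3.jar",
    "/some/place/distfiles/bar-1.1.jar",
    "/some/place/distfiles/bar-1.1.2.jar",
]
latest = latest_per_name(paths)
```

Comparing tuples of integers rather than strings is what makes 1.10 rank above 1.2, and keeping the path next to the version means the result points at the actual file instead of only its version number.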

That said, I would guess the version test can be used to find the newest one, once the basename of each item has been split into a tuple[str, str] of name and version number:

- find:
    paths:
    - /some/place/distfiles
    # etc
  register: my_jars
- set_fact:
    versions_by_filename: >-
       {%- set results = {} -%}
       {%- for f in my_jars.files -%}
       {%-   set bn = f.path | basename
             | regex_replace(ver_regex, '\\1') -%}
       {%-   set v = f.path | basename
             | regex_replace(ver_regex, '\\2') -%}
       {%-   set _ = results.setdefault(bn, []).append(v) -%}
       {%- endfor -%}
       {{ results }}
  vars:
    ver_regex: '(.*)-([0-9.-]+)\.jar'

- set_fact:
    most_recent: >-
       {%- set results = {} -%}
       {%- for fn, ver_list in versions_by_filename.items() -%}
       {%-   set tmp = namespace(latest=ver_list[0]) -%}
       {%-   for v in ver_list -%}
       {%-      if tmp.latest is version(v, '<') -%}
       {%-        set tmp.latest = v -%}
       {%-      endif -%}
       {%-   endfor -%}
       {%-   set _ = results.update({fn: tmp.latest}) -%}
       {%- endfor -%}
       {{ results }}



Answer 2:


Let's transform the data for this purpose first. For example, with the output of find registered as result,

    - set_fact:
        my_files: "{{ result.files|json_query('[].path') }}"
    - debug:
        var: my_files

gives

  my_files:
  - /some/place/distfiles/foo-1.2.jar
  - /some/place/distfiles/bar-1.1.jar
  - /some/place/distfiles/bar-1.1.2.jar
  - /some/place/distfiles/subdir/foo-1.3.jar

Create a list of dictionaries, each holding the path, the archive name, and the version

    - set_fact:
        my_dict: "{{ my_dict
                    |default([]) + [
                     dict(['path', 'archive', 'version']
                          |zip([item,
                                (item|basename).split('-')[0],
                                (item|basename).split('-')[1]|splitext|list|first])) ] }}"
      loop: "{{ my_files }}"
    - debug:
        var: my_dict

gives

  my_dict:
  - archive: foo
    path: /some/place/distfiles/foo-1.2.jar
    version: '1.2'
  - archive: bar
    path: /some/place/distfiles/bar-1.1.jar
    version: '1.1'
  - archive: bar
    path: /some/place/distfiles/bar-1.1.2.jar
    version: 1.1.2
  - archive: foo
    path: /some/place/distfiles/subdir/foo-1.3.jar
    version: '1.3'

Group the items by the name of the archive

    - set_fact:
        my_groups: "{{ my_dict|groupby('archive') }}"
    - debug:
        var: my_groups

gives

  my_groups:
  - - bar
    - - archive: bar
        path: /some/place/distfiles/bar-1.1.jar
        version: '1.1'
      - archive: bar
        path: /some/place/distfiles/bar-1.1.2.jar
        version: 1.1.2
  - - foo
    - - archive: foo
        path: /some/place/distfiles/foo-1.2.jar
        version: '1.2'
      - archive: foo
        path: /some/place/distfiles/subdir/foo-1.3.jar
        version: '1.3'

When the data is ready, find the latest versions and print the results

    - debug:
        msg: "Latest version of {{ item.0 }} is {{ item.1|json_query(query)|first }}"
      vars:
        query: "[?version == '{{ latest_version }}'].path"
        latest_version: "{{ item.1|json_query('[].version')|max }}"
      loop: "{{ my_groups }}"

gives

  msg: Latest version of bar is /some/place/distfiles/bar-1.1.2.jar
  msg: Latest version of foo is /some/place/distfiles/subdir/foo-1.3.jar


Notes
  • Problem with the max filter

Credit @mdaniel: "max suffers from the long-standing problem of trying to use a lexicographical sort for version numbers, as it claims that of the choices 1.2 and 1.10, that 1.2 is the 'latest'"
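The effect is easy to reproduce outside Ansible. The short Python sketch below uses the two versions from the quote; the helper version_key is a hypothetical name and assumes purely numeric, dot-separated versions.

```python
versions = ["1.2", "1.10"]


def version_key(version):
    """Turn '1.10' into (1, 10) so components compare as integers."""
    return tuple(int(part) for part in version.split("."))


# Plain max compares the strings character by character: '2' beats '1'.
lexicographic_max = max(versions)

# With a numeric key the comparison is (1, 2) against (1, 10).
numeric_max = max(versions, key=version_key)
```

Here lexicographic_max is '1.2' while numeric_max is '1.10', which is why the play needs a version-aware comparison rather than a plain string maximum.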

It's possible to put a version-aware maximum into a custom filter plugin and use it to select the latest version. For example

$ cat filter_plugins/version_filters.py

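try:
    # ansible-core 2.12 and later bundle their own LooseVersion
    from ansible.module_utils.compat.version import LooseVersion
except ImportError:
    # Fallback for older installs; distutils was removed in Python 3.12
    from distutils.version import LooseVersion


def version_max(versions):
    """Return the highest version string, compared as versions."""
    return max(versions, key=LooseVersion)


class FilterModule(object):
    """Expose version_max as a Jinja2 filter."""

    def filters(self):
        return {
            'version_max': version_max,
        }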

Then use it in the play in place of max:

latest_version: "{{ item.1|json_query('[].version')|version_max }}"

  • The play expects that the filenames have the format

<archive>-<version>.<extension>
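This naming assumption matters because the play splits the basename on every hyphen. A minimal Python sketch of the difference is below. The helper split_archive_version and the file commons-lang-3.1.jar are hypothetical, and splitting on the last hyphen is only one possible way to relax the assumption.

```python
import os


def split_archive_version(path):
    """Split '<archive>-<version>.<extension>' on the last hyphen."""
    stem, _extension = os.path.splitext(os.path.basename(path))
    archive, separator, version = stem.rpartition("-")
    if not separator:
        raise ValueError("no '-' in filename: " + path)
    return archive, version


# A name in the expected format, taken from the question.
simple = split_archive_version("/some/place/distfiles/bar-1.1.2.jar")

# Hypothetical file whose archive name itself contains a hyphen.
hyphenated = split_archive_version("/some/place/distfiles/commons-lang-3.1.jar")

# What split('-') yields for the same name: the first two items are what
# the play would pick up as archive and version.
naive = os.path.basename("/some/place/distfiles/commons-lang-3.1.jar").split("-")
```

For the hyphenated name, split('-') produces 'commons' and 'lang' as its first two items, so the play would record the wrong archive and a non-numeric version, whereas splitting on the last hyphen gives ('commons-lang', '3.1').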


Source: https://stackoverflow.com/questions/59691073/how-to-filter-the-latest-version-of-every-file-in-a-tree-using-ansible
