How to find common values in a list

PanHo · June 9, 2024, 5:48pm

Hi,
I have a list
[[1,7],[2,3,4,6],[1,3,6],[5,7]]
how to get
[[1,7],[3,6],[3,6],[5,7]]
two digits 3 en 6 are common in the sublists. the goal is to find two commonalities. “two” can be parameterized to “three” eventually. So when “three” then 3 sublists should be found
its a sudoku solving strategy Hidden Candidates - SudokuWiki.org

onePythonUser · June 9, 2024, 6:45pm

Hi,

here is a play at a potential solution. However, this is more of a brute force type of workaround. It requires you to explicitly define the actual sub-list indexes ahead of time and then populating the original sub-list indexes with the common value sub-list results.

the_list = [[1, 7], [2, 3, 4, 6], [1, 3, 6], [5, 7]]

t = list(set(the_list[1]) & set(the_list[2]))
the_list[1], the_list[2] = t, t

print(the_list)

By the way, from your sample results list, it appears that the test only applies to sub lists that are adjacent to each other in the list and not to sub lists that are not. For example, the first and the fourth sub lists have 7 as a common value but are not modified as [7] as in:

[[7],[3,6],[3,6],[7]]

Can you clarify.

MRAB · June 9, 2024, 6:48pm

Count how many there are of each number and then remove any number that occurs only once.

chain from the itertools module and Counter from the collections module are functions that come to mind.

In fact, instead of removing the unwanted numbers, it’s better to build a new list of lists containing the wanted numbers.

PanHo · June 9, 2024, 8:22pm

hi paul,
Thank you for your response
the sublists can be but not must be adjacent.
the goal is to find two sublists who have two values in common.
And then in a second run : remove all otther values from found sublists

onePythonUser · June 9, 2024, 9:42pm

Okay, so let’s refine the requirements.

a. Two sublists must have ONLY two values in common? Is this the minimum requirement?
b. What if two sublists have more than two values in common? How about only one?
What if a sublist has two values in common with more than one sublist, i.e., multiple sublists?

Here is an example:

one_list = [ [2, 5, 6, 7], [6, 2, 3, 8], [5, 6], [1, 3, 8], [9, 5, 1], [2, 7, 5], [3, 2]  ]

What would be your expected result?

Try to be as thorough and clear as possible as the requirements will dictate the script design approach.

One more point: Is the number of sublists known beforehand? Does it vary at any given time?

NeilGirdhar · June 9, 2024, 10:40pm

This is a great problem, but I don’t think it was really well-explained. Let’s explore two equivalent formulations of your problem:

Given a list of cells that can all “see” each other, and each cell having possibilities given by:

c: list[set[int]]

The way you worded the problem: You would like to do a simplification whereby if a subset of cells C of size n contain between them the only occurrences of some set of digits I of size n, then those cell’s possibilities can be intersected by I.

I think it would be simpler to attack the problem: If some subset of cells C of size n contains between them a set of digits I of size n, then the rest of the cells \Omega - C can be intersected with \Omega - I.

NeilGirdhar · June 9, 2024, 11:32pm

A basic dynamic programming solution:

from itertools import combinations


cl = [[6, 9], [3, 7, 9], [2, 4], [8], [5], [1], [3, 6, 7, 9], [2, 6, 9], [4, 6, 9]]
c = [set(x) for x in cl]
d: dict[tuple[int, ...], set[int]] = {(i,): x
                                      for i, x in enumerate(c)
                                      if len(x) != 1}
unknown_cells = [i for i, c in enumerate(cl) if len(c) > 1]
n = len(cl)


for this_set_size in range(2, n):
    for index in combinations(unknown_cells, this_set_size):
        x = d[index[:-1]] | d[(index[-1],)]
        d[index] = x
        if len(x) == this_set_size:
            other_cells = set(range(n)) - set(index)
            found = False
            for i in other_cells:
                if x & c[i]:
                    found = True
                c[i] -= x
            if found:
                print(f"Found a naked tuple over cells {index} having possibilities {x}")


print(c)

onePythonUser · June 10, 2024, 12:14am

Hi,

is this the general solution for the original problem?

When entering the OPs original list into this script for testing, a very different solution is displayed (granted, the OP still needs to clarify additional requirement information).

[{1, 7}, {2, 3, 4, 6}, {1, 3, 6}, {5, 7}]

NeilGirdhar · June 10, 2024, 12:26am

The two problem formulations are only equivalent if you specify the entire set of cells that can see each other. I solved the equivalent (but simpler) formulation.

onePythonUser · June 10, 2024, 12:29am

Roger that, over. We’ll wait for the OP for additional details.

kknechtel · June 10, 2024, 6:53am

I think 7 is also common. Is that wrong? If so, why?

It’s necessary to have a clear understanding of a problem before it can be solved.

NeilGirdhar · June 10, 2024, 9:38am

@onePythonUser and @kknechtel Did you both click on the link? This question is how to find “naked tuples”. I gave a formal definition of the problem.

PanHo · June 10, 2024, 6:02pm

Hi Neil,
Your supplied code seems to work.
I have “listed” the first row from Hidden Candidates - SudokuWiki.org (first paragraph “hidden pairs”
[[1,2,5,8],[1,2,3,8],[2,3],[1,2,9],[1,2,3,5,9],[5,9],[4,5,8,9],[2,3,4,5,6,7,9],[3,4,5,6,7,9]]
and used this to feed your code
your code printed:
[{8, 1, 2, 5}, {8, 1, 2, 3}, {2, 3}, {1, 2, 9}, {1, 2, 3, 5, 9}, {9, 5}, {4}, {6, 7}, {6, 7}]
the {6,7} pair is correct
But to my surpise the {4} is also correct. This is an unique value (hidden single)for the complete row after the 4(among others) had been removed out of the {6,7} pair.
By the way and to nitpick you print two times naked tuple, should be hidden tuple and hidden single.
This is a pet project (Sudoku solver) for me to learn python. I do not know why your code works but i will find out!
Thank you so much

onePythonUser · June 10, 2024, 8:01pm

Awesome! It turned out be just what the OP wanted.