Sudoku Analysis¶

There is a PDF version of this document available.

Quickstart

The simple documentation is also available.

Abstract

Contents

Abbreviations

BCF: see Blake Canonical Form
CNF: see Conjunctive normal form
DNF: see Disjunctive normal form
DPLL: see DPLL algorithm

Glossary

Adjacency matrix

As Wikipedia describes it[WPADJ]:

In graph theory and computer science, an adjacency matrix is a square matrix used to represent a finite graph. The elements of the matrix indicate whether pairs of vertices are adjacent or not in the graph.

In the special case of a finite simple graph, the adjacency matrix is a (0,1)-matrix with zeros on its diagonal. If the graph is undirected (i.e. all of its edges are bidirectional), the adjacency matrix is symmetric.

Blake Canonical Form

As Wikipedia describes it[WPBCF]:

In Boolean logic, a formula for a Boolean function \(f\) is in Blake canonical form (BCF), also called the complete sum of prime implicants, the complete sum, or the disjunctive prime form, when it is a disjunction of all the prime implicants of \(f\). […]

The Blake canonical form is a special case of disjunctive normal form.

The Blake canonical form is not necessarily minimal, however all the terms of a minimal sum are contained in the Blake canonical form. […]

Selecting a minimal sum from a Blake canonical form amounts in general to solving the set cover problem, so is NP-hard.

Conjunctive normal form

As Wikipedia describes it[WPCNF]:

In Boolean algebra, a formula is in conjunctive normal form (CNF) or clausal normal form if it is a conjunction of one or more clauses, where a clause is a disjunction of literals; otherwise put, it is a product of sums or an AND of ORs.

A logical formula is considered to be in CNF if it is a conjunction of one or more disjunctions of one or more literals. As in disjunctive normal form (DNF), the only propositional operators in CNF are or (\({\displaystyle \vee }\)), and (\({\displaystyle \wedge }\)), and not (\({\displaystyle \neg }\)). The not operator can only be used as part of a literal, which means that it can only precede a propositional variable.

Disjunctive normal form

As Wikipedia describes it[WPDNF]:

In boolean logic, a disjunctive normal form (DNF) is a canonical normal form of a logical formula consisting of a disjunction of conjunctions; it can also be described as an OR of ANDs, a sum of products. […]

A logical formula is considered to be in DNF if it is a disjunction of one or more conjunctions of one or more literals. A DNF formula is in full disjunctive normal form if each of its variables appears exactly once in every conjunction and each conjunction appears at most once (up to the order of variables). As in conjunctive normal form (CNF), the only propositional operators in DNF are and (\({\displaystyle \wedge }\)), or (\({\displaystyle \vee }\)), and not (\({\displaystyle \neg }\)). The not operator can only be used as part of a literal, which means that it can only precede a propositional variable.

DPLL algorithm

This algorithm ist utterly irrelevant to the satoku matrix.

As Wikipedia describes it[WPDPLL]:

In logic and computer science, the Davis–Putnam–Logemann–Loveland (DPLL) algorithm is a complete, backtracking-based search algorithm for deciding the satisfiability of propositional logic formulae in conjunctive normal form, i.e. for solving the CNF-SAT problem.

In logic and computer science, the Davis–Putnam–Logemann–Loveland (DPLL) algorithm is a complete, backtracking-based search algorithm for deciding the satisfiability of propositional logic formulae in conjunctive normal form, i.e. for solving the CNF-SAT problem.

The basic backtracking algorithm runs by choosing a literal, assigning a truth value to it, simplifying the formula and then recursively checking if the simplified formula is satisfiable; if this is the case, the original formula is satisfiable; otherwise, the same recursive check is done assuming the opposite truth value. This is known as the splitting rule, as it splits the problem into two simpler sub-problems. The simplification step essentially removes all clauses that become true under the assignment from the formula, and all literals that become false from the remaining clauses.

The DPLL algorithm enhances over the backtracking algorithm by the eager use of the following rules at each step:

Unit propagation

If a clause is a unit clause, i.e. it contains only a single unassigned literal, this clause can only be satisfied by assigning the necessary value to make this literal true. Thus, no choice is necessary. Unit propagation consists in removing every clause containing a unit clause’s literal and in discarding the complement of a unit clause’s literal from every clause containing that complement. In practice, this often leads to deterministic cascades of units, thus avoiding a large part of the naive search space.

Pure literal elimination

If a propositional variable occurs with only one polarity in the formula, it is called pure. A pure literal can always be assigned in a way that makes all clauses containing it true. Thus, when it is assigned in such a way, these clauses do not constrain the search anymore, and can be deleted.

Unsatisfiability of a given partial assignment is detected if one clause becomes empty, i.e. if all its variables have been assigned in a way that makes the corresponding literals false. Satisfiability of the formula is detected either when all variables are assigned without generating the empty clause, or, in modern implementations, if all clauses are satisfied. Unsatisfiability of the complete formula can only be detected after exhaustive search.

Implicant

As Wikipedia describes it[WPIMP]:

In Boolean logic, the term implicant has either a generic or a particular meaning. In the generic use, it refers to the hypothesis of an implication (implicant). In the particular use, a product term (i.e., a conjunction of literals) \(P\) is an implicant of a Boolean function \(F\), denoted \({\displaystyle P\leq F}\), if \(P\) implies \(F\) (i.e., whenever \(P\) takes the value 1 so does \(F\)).

Implicate

An implicate is a sum term (i.e., a disjunction of literals) \(P\) which implies falsehood of a Boolean function \(F\). I.e., whenever \(P\) takes the value 0 so does \(F\).

References

[SHINY]

Shiny, A. K., & Pujari, A. K. (1999). An Efficient Algorithm to Generate Prime Implicants. Journal of Automated Reasoning, 22(2), 149–170. https://doi.org/10.1023/A:1005940031099.

In this paper, an efficient recursive algorithm is presented to compute the set of prime implicants of a propositional formula in conjunctive normal form (CNF). The propositional formula is represented as a (0,1)-matrix, and a set of 1’s across its columns are termed as paths. The algorithm finds the prime implicants as the prime paths in the matrix using the divide-and-conquer technique. The algorithm is based on the principle that the prime implicant of a formula is the concatenation of the prime implicants of two of its subformulae. The set of prime paths containing a specific literal and devoid of a literal are characterized. Based on this characterization, the formula is recursively divided into subformulae to employ the divide-and-conquer paradigm. The prime paths of the subformulae are then concatenated to obtain the prime paths of the formula. In this process, the number of subsumption operations is reduced. It is also shown that the earlier algorithm based on prime paths has some avoidable computations that the proposed algorithm avoids. Besides being more efficient, the proposed algorithm has the additional advantage of being suitable for the incremental method, without recomputing prime paths for the updated formula. The subsumption operation is one of the crucial operations for any such algorithms, and it is shown that the number of subsumption operation is reduced in the proposed algorithm. Experimental results are presented to substantiate that the proposed algorithm is more efficient than the existing algorithms.

[WPADJ]

Wikipedia contributors. (2025, March 28). Adjacency matrix. In Wikipedia, The Free Encyclopedia. Retrieved 20:27, April 11, 2025, from https://en.wikipedia.org/w/index.php?title=Adjacency_matrix&oldid=1282826257

[WPBCF]

Wikipedia contributors. (2025, March 23). Blake canonical form. In Wikipedia, The Free Encyclopedia. Retrieved 15:29, April 10, 2025, from https://en.wikipedia.org/w/index.php?title=Blake_canonical_form&oldid=1281935421

[WPCNF]

Wikipedia contributors. (2025, February 11). Conjunctive normal form. In Wikipedia, The Free Encyclopedia. Retrieved 20:27, April 10, 2025, from https://en.wikipedia.org/w/index.php?title=Conjunctive_normal_form&oldid=1275190406

[WPDNF]

Wikipedia contributors. (2025, April 4). Disjunctive normal form. In Wikipedia, The Free Encyclopedia. Retrieved 01:46, April 11, 2025, from https://en.wikipedia.org/w/index.php?title=Disjunctive_normal_form&oldid=1283900821

[WPDPLL]

Wikipedia contributors. (2025, February 21). DPLL algorithm. In Wikipedia, The Free Encyclopedia. Retrieved 04:30, April 12, 2025, from https://en.wikipedia.org/w/index.php?title=DPLL_algorithm&oldid=1276987229

[WPIMP]

Wikipedia contributors. (2025, January 14). Implicant. In Wikipedia, The Free Encyclopedia. Retrieved 14:49, April 10, 2025, from https://en.wikipedia.org/w/index.php?title=Implicant&oldid=1269337610

[CHENG201115]

Cheng, CK. (2011). Lecture 15: Karnaugh Maps II. In CSE20, Discrete Mathematics. University of California, San Diego. https://cseweb.ucsd.edu/classes/sp11/cse20-a/notes/lec15.ppt.

Objective of this course is to introduce discrete mathematics for computer system designs.

[deKleer1999]

Kleer, J. de. (1999). An Improved Incremental Algorithm for Generating Prime Implicates. In H. J. Levesque & F. Pirri (Eds.), Logical Foundations for Cognitive Agents: Contributions in Honor of Ray Reiter (pp. 103–112). Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-642-60211-5_9.

In 1987 Ray Reiter and I wrote a paper entitled “Foundations of assumption-based truth maintenance systems: Preliminary report” which showed how the behavior of the Assumption-Based Truth Maintenance System can be defined using the notions of prime implicate and prime implicant. This definition of the ATMS immediately suggests generalizing the ATMS to operate on arbitrary clauses. This generalization raises two immediate computational challenges (to me, not Ray who seems immune to such challenges). First, computing prime implicates/implicants is very expensive. Second, since ATMS’s are used incrementally we need to exploit previous computation. This paper describes an improved and incremental algorithm to compute prime implicates/implicants. This algorithm allows us to experiment with the ideas Ray and I laid out in our paper. Unfortunately, the task is inherently NP-complete and all this paper can accomplish is present a more clever incremental algorithm.

[DELVAL1994551]

del Val, A. (1994). Tractable Databases: How to Make Propositional Unit Resolution Complete through Compilation. In J. Doyle, E. Sandewall, & P. Torasso (Eds.), Principles of Knowledge Representation and Reasoning (pp. 551–561). Morgan Kaufmann. https://doi.org/10.1016/B978-1-4832-1452-8.50146-9.

We present procedures to compile any propositional clausal database σ into a logically equivalent “compiled” database σ* such that. for any clause C, σ |= C if and only if there is a unit refutation of σ* U C. It follows that once the compilation process is complete any query about the logical consequences of σ can be correctly answered in time linear in the sum of the sizes of σ* and the query. The compiled database σ* is for all but one of the procedures a subset of the set PI(σ) of prime implicates of σ, but σ* can be exponentially smaller than PI(σ). Of independent interest, we prove the equivalence of unit-refutability with two restrictions of resolution, and provide a new sufficient condition for unit refutation completeness, thus identifying a new class of tractable theories, one which is of interest to abduction problems as well. Finally, we apply the results to the design of a complete LTMS.

[Echenim2017PrimeIG]

Echenim, M., Peltier, N., & Tourret, S. (2017). Prime Implicate Generation in Equational Logic. J. Artif. Intell. Res., 60, 827–880. https://api.semanticscholar.org/CorpusID:8314981.

We present an algorithm for the generation of prime implicates in equational logic, that is, of the most general consequences of formulæ containing equations and disequations between first-order terms. This algorithm is defined by a calculus that is proved to be correct and complete. We then focus on the case where the considered clause set is ground, i.e., contains no variables, and devise a specialized tree data structure that is designed to efficiently detect and delete redundant implicates. The corresponding algorithms are presented along with their termination and correctness proofs. Finally, an experimental evaluation of this prime implicate generation method is conducted in the ground case, including a comparison with state-of-the-art propositional and first-order prime implicate generation tools.

[KEAN1990185]

Kean, A., & Tsiknis, G. (1990). An incremental method for generating prime implicants/implicates. Journal of Symbolic Computation, 9(2), 185–206. https://doi.org/https://doi.org/10.1016/S0747-7171(08)80029-6.

Given the recent investigation of Clause Management Systems (CMSs) for ArtificialIntelligence applications, there is an urgent need for an efficient incremental method for generating prime implicants. Given a set of clauses F, a set of prime implicants II of F and a clause C1 the problem can be formulated as finding the set of prime implicants for II U C. Intuitively, the property of implicants, being prime implies that any effort to generate prime implicants from a set of prime implicants will not yield any new prime implicants but themselves. In this paper, we exploit the properties of prime implicants and propose an incremental method for generating prime implicants from a set of existing prime implicants plus a new clause. The correctness proof and complexity analysis of the incremental method are presented, and the intricacy of subsumptions in the incremental method is also examined. Additionally, the role of prime implicants in the CMS is also mentioned.

[MOSSE]

Mossé, M., Sha, H., & Tan, L.-Y. (2022). A Generalization of the Satisfiability Coding Lemma and Its Applications. In K. S. Meel & O. Strichman (Eds.), 25th International Conference on Theory and Applications of Satisfiability Testing (SAT 2022) (Vol. 236, pp. 9:1–9:18). Schloss Dagstuhl – Leibniz-Zentrum für Informatik. https://doi.org/10.4230/LIPIcs.SAT.2022.9.

The seminal Satisfiability Coding Lemma of Paturi, Pudlák, and Zane is a coding scheme for satisfying assignments of k-CNF formulas. We generalize it to give a coding scheme for implicants and use this generalized scheme to establish new structural and algorithmic properties of prime implicants of k-CNF formulas. Our first application is a near-optimal bound of \(n \cdot 3^{n(1-Ω(1/k))}\) on the number of prime implicants of any n-variable k-CNF formula. This resolves an open problem from the Ph.D. thesis of Talebanfard, who proved such a bound for the special case of constant-read k-CNF formulas. Our proof is algorithmic in nature, yielding an algorithm for computing the set of all prime implicants - the Blake Canonical Form BCF - of a given k-CNF formula. The problem of computing the Blake Canonical Form BCF of a given function is a classic one, dating back to Quine, and our work gives the first non-trivial algorithm for k-CNF formulas.

[SCH2013PDE]

Scherer, W. (2013). Partial Distributive Expansion of CNF Problems. https://sw-amt.ws/satoku/doc/doc-partial-distributive-expansion/README.html.

Partial distributive expansion of CNF formulas by expanding disjunctive CNF clauses fo full DNF clauses for propagation of required conjunctive refinements.