# Symmetry in Boolean Satisfiability

## Transcript Of Symmetry in Boolean Satisfiability

Symmetry 2010, 2, 1121-1134; doi:10.3390/sym2021121 Review

OPEN ACCESS

symmetry

ISSN 2073-8994 www.mdpi.com/journal/symmetry

Symmetry in Boolean Satisfiability

Fadi A. Aloul

American University of Sharjah, P.O. Box 26666, Sharjah, United Arab Emirates; E-Mail: [email protected]

Received: 31 December 2009; in revised form: 14 May 2010 / Accepted: 9 June 2010 / Published: 11 June 2010

Abstract: This paper reviews recent approaches on how to accelerate Boolean Satisfiability (SAT) search by exploiting symmetries in the problem space. SAT search algorithms traverse an exponentially large search space looking for an assignment that satisfies a set of constraints. The presence of symmetries in the search space induces equivalence classes on the set of truth assignments. The goal is to use symmetries to avoid traversing all assignments by constraining the search to visit a few representative assignments in each equivalence class. This can lead to a significant reduction in search runtime without affecting the completeness of the search.

Keywords: boolean satisfiability; symmetries; search

1. Introduction

The last few years have seen a remarkable growth in the use of Boolean Satisfiability (SAT) models and algorithms for solving various problems in Engineering and Computer Science. This is mainly due to the fact that SAT algorithms have seen tremendous improvements in the last few years, allowing larger problem instances to be solved in different application domains. Such applications include formal verification [1], FPGA routing [2], power estimation [3], fault tolerance [4], network assignments [5], wireless communications [6], and scheduling [7]. SAT has also been extended to a variety of applications in Artificial Intelligence, including other well-known NP-complete problems.

SAT solvers have traditionally been used to solve decision problems. Given a set of Boolean variables and constraints expressed in products-of-sum form (also known as conjunctive normal form (CNF)), the goal is to identify a variable assignment that will satisfy all constraints in the problem or prove that no such assignment exists. It is well known that the SAT problem is NP-complete [8] and that any algorithmic approach for solving it will require at least worst-case exponential time in its size.

Symmetry 2010, 2

1122

Most powerful SAT solvers today are search based and use intelligent techniques to explore new regions of the search space while looking for a satisfying assignment. These intelligent techniques helped extend the application of SAT solvers to large problem instances consisting of thousands of variables and millions of constraints. Despite these advances, the tremendous growth in today’s designs is continuously outpacing the capabilities of existing SAT solvers. Many SAT instances remain hard to solve due to the significant size and complexity of the underlying systems they represent.

Prasad et al. observed that Boolean functions arising in Engineering applications are “structured” [9]. Unlike random problems, structured problems have non-uniform distributions and consist of clusters of variables. In this paper, we summarize the main findings in exploiting and analyzing the SAT instance’s structure to improve the search process. Specifically, we review an approach to accelerate SAT search by exploiting symmetries in the instance. The approach consists of an automated flow for (1) finding all syntactic symmetries of CNF formulas and (2) utilizing the symmetries to add symmetry-breaking predicates that effectively prune the search space and accelerate SAT solvers. Presented experimental results show the advantage of symmetry breaking for search algorithms.

The remainder of this paper is organized as follows. Section 2 presents an overview of Boolean Satisfiability. Section 3 describes basic symmetry definitions and notations. In Section 4, graph automorphism, which is used to identify symmetries, is described. Symmetry detection and breaking are discussed in Sections 5 and 6, respectively. Experimental results are presented in Section 7. Finally, the paper is concluded in Section 8.

2. Boolean Satisfiability

Recent years have seen significant advances in Boolean Satisfiability (SAT) solving. These advances have led to the successful deployment of SAT solvers in a wide range of problems in Engineering and Computer Science. The SAT problem involves finding an assignment to a set of binary variables that satisfies a given set of constraints or proving that no such assignment exists. In general, these constraints are expressed in products-of-sum form, also known as conjunctive normal form (CNF). SAT constraints will be referred to as clauses in the paper. A CNF formula, ϕ on n binary variables, x1,..., xn , consists of the conjunction (AND) of m clauses, ω1,...,ωm , each of which consists of the disjunction (OR) of literals. A literal is an occurrence of a Boolean variable or its complement.

As an example, the CNF instance:

f (a,b, c) = (a ∨ b) ⋅ (b ∨ c)

(1)

consists of three variables, two clauses, and four literals. The assignment {a = 1, b = 0, c = 0} leads to a conflict, whereas the assignment {a = 1, b = 0, c = 1} satisfies f. Note that a problem with n variables will have 2n possible assignments to test. The above example with three variables has eight possible assignments. An instance with 100 variables will have 1.27e + 30 assignments. Assuming a processor that can verify an assignment every one nanosecond, the processor will complete testing all 2100 assignments in 4e + 12 years.

Symmetry 2010, 2

1123

Despite the SAT problem being NP-Complete [8], there have been dramatic improvements in SAT solver technology over the past decade. This has led to the development of several powerful SAT algorithms that are capable of solving problems consisting of thousands of variables and millions of constraints. Such solvers include Grasp [10], zChaff [11], Berkmin [12], MiniSAT [13].

Most powerful SAT solvers are based on the original Davis-Putnam-Logemann-Loveland (DPLL) backtrack search algorithm [14]. The algorithm performs a depth first search process that traverses the space of 2n variable assignments until a satisfying assignment is found (the formula is satisfiable), or all combinations have been exhausted (the formula is unsatisfiable). The search process proceeds as follows. Originally, all variables are unassigned. The algorithm begins by choosing a decision assignment to an unassigned variable. A decision tree is maintained to keep track of variable assignments. An example of a decision tree is shown in Figure 1. After each decision, the algorithm determines the implications of the assignment on other variables. This is obtained by forcing the assignment of the variable representing an unassigned literal in an unresolved clause, whose all other literals are assigned to 0, to satisfy the clause. This is referred to as the unit clause rule. If no conflict is detected, the algorithm makes a new decision on a new unassigned variable. Otherwise, the backtracking process un-assigns one or more recently assigned variables and the search continues in another area of the search space.

Figure 1. An example of a satisfiable SAT instance showing its corresponding decision tree.

SAT solvers have been extended with several powerful algorithms to further expedite the search process. One of the best algorithms is known as the conflict analysis procedure [10] and has been implemented in almost all SAT solvers. Whenever a conflict is detected, the procedure identifies the causes of the conflict and augments the clause database with additional clauses, known as conflict-induced clauses, to avoid regenerating the same conflict in future parts of the search process. In essence, the procedure performs a form of learning from the encountered conflicts. Significant

Symmetry 2010, 2

1124

speedups have been achieved with the addition of conflict-induced clauses, as they tend to effectively prune the search space.

Intelligent decision heuristics and random restarts [11] also played an important role in enhancing the SAT solvers performance. The developers of the state-of-the-art SAT solver, zChaff [11], proposed an effective decision heuristic, known as VSIDS, and implemented several other enhancements, including random restarts, which lead to dramatic performance gains on many CNF instances.

Over the years, SAT has been directly applied to various electronic design automation (EDA) computational tasks. In general, the overhead of reducing many of these problems to SAT is small. Below, we briefly survey a number of EDA applications to which SAT has been successfully introduced.

One of the important applications of SAT is circuit verification. Due to the increasing complexity of modern hardware designs, verifying the correctness of these designs is becoming increasingly difficult. SAT has been successfully applied to formal verification, which has attracted much interest from industry, since unlike simulation, which only reveals the presence of bugs, formal verification can prove their absence. Specifically, combinational equivalence checking has been solved using SAT [15], and later, several attempts have been made to solve sequential equivalence checking problems using SAT [16]. Additionally, SAT has been used for bounded model checking [1], microprocessor verification [17], and functional vector generation [18]. Besides verification, SAT has also been heavily applied in automatic test pattern generation (ATPG) [19] and extended to delay fault testing [20]. SAT has also been used in FPGA routing [2], global routing [21], logic synthesis [22], cross talk noise analysis [23], power estimation [3], and power leakage estimation [24].

Additionally, SAT has been extended to a variety of applications in Artificial Intelligence, including other well known NP-complete problems such as graph colorability, vertex cover, hamiltonian path, and independent sets [25].

3. Symmetry Definitions and Notations

The symmetry of a discrete object is a permutation of its components that leaves the object intact. For example, the rotations of a spatial solid, e.g., a cylinder, that leave its shape unchanged or the negation of a Boolean variable in a Boolean formula, e.g., x in (x + y) ⋅ (x + y) ⋅ ( y + z) , that does not affect the formula or the function it represents.

Permutational symmetries can be classified as either syntactic or semantic symmetries. Given a Boolean function, a semantic symmetry is a permutation of variables that does not change the value of the function under any variable assignment. On the other hand, a syntactic symmetry, also known as a structural symmetry, is a permutation that does not change the representation of the function. Hence, a syntactic symmetry is also a semantic symmetry. In this paper, we focus on using syntactic symmetries to improve SAT search.

In this paper, we will use the cycle notation to express the permutations. For example, a permutation that swaps elements 2 and 3, and maps 4 to 5, 5 to 6, and 6 to 4, while mapping all other elements to themselves will be represented as ϕ = (23)(456) .

In general, the number of permutations in a permutation group can be exponentially large. Rather than explicitly representing the set of permutations in a group, they are typically represented implicitly

Symmetry 2010, 2

1125

using irredundant sets of generators. A set of generators is a set of group elements such that any other group element can be obtained from the products of the generators.

As an illustration, consider the permutations ϕ1 = (12) and ϕ2 = (23) expressed in cycle notation. Taking the composition, i.e., product, of ϕ1 ⋅ϕ2 generates the permutation ϕ3 = (123) , which is the result of performing ϕ2 followed by ϕ1.

A set of generators is irredundant if it is not possible to express any of its elements as a product of other elements. An interesting observation from group theory is that an irredundant set of generators for a group with N > 1 elements contains at most log2 N elements [26]. For example, the group of all permutations on k elements has k! permutations, but can be efficiently generated by only two generators: ϕ1 = (12) and ϕ2 = (12…k) . Thus, the use of irredundant sets of generators to express the complete set of permutations ensures exponential compression.

In this paper, we will assume a total ordering x1 < x2 < …< xn of the variables x1, x2 ,.…, xn and consider the induced lexicographic ordering of the 2n truth assignments, i.e., 0–1 strings of length n. Given an equivalence partition on these assignments, the lex-leader of a particular equivalence class, i.e., orbit, is defined as its lexicographically smallest element. Adding a lex-leader predicate ensures that a Boolean function evaluates to true only on lex-leaders of orbits.

For example, consider a Boolean function with four variables. Assume an orbit consists of the assignments {0100, 1000, 0010, 0001}. The lex-leader of the given orbit is {0001}. Adding a lexleader predicate will falsify the first three assignments.

In terms of a CNF formula, a permutation of literals is a symmetry of a given CNF formula if Boolean consistency is observed and the formula is preserved under the permutation. In other words, every clause must map into a clause with the same polarities of literals. The addition of a symmetrybreaking predicate (SBP), expressed as a set of CNF clauses, allows only one of many equivalent variable assignments to be a potential solution to the formula. An SBP is a full SBP if it selects exactly a single element from each orbit; otherwise it is known as a partial SBP. A lex-leader SBP (LL-SBP) is an SBP that selects lex-leaders only and is also classified as a full SBP. It is essential for partial SBPs, however, to select lex-leaders among other elements. Such SBPs are known as partial lex-leader SBPs (PLL-SBPs). Note that the addition of an SBP to the original CNF formula does not affect its satisfiability, but restricts the possible solutions to those selected by the SBP. In other words, if the original formula is satisfiable, the number of solutions may decrease considerably after pre-processing. However, if the original instance is unsatisfiable, the “number of equivalent paths leading nowhere” will be reduced and the SAT solver is likely to conclude faster that no solution exists.

As an example, assume the CNF formula:

f (a,b, c, d, e, f , g) = (c ∨ e ∨ f )(a ∨ c)(b ∨ c)(d ∨ e)(d ∨ f )

(2)

The group of all permutations of the CNF formula consists of four permutations {E, (ab), (dd )(ef ), (ab)(dd )(ef )} that can be efficiently obtained from the products of the two generators ϕ1 = (ab) and ϕ2 = (dd )(ef ) . Note that E is the identity symmetry. Permuting the literals of the formula according to all four permutations yields the same formula. This is shown in Table 1.

Symmetry 2010, 2

1126

Table 1. Permuting the literals of the CNF formula in (2) according to all its symmetries.

Permutation

E

(ab) (dd )(ef ) (ab)(dd )(ef )

CNF Formula (c ∨ e ∨ f ) ⋅ (a ∨ c) ⋅ (b ∨ c) ⋅ (d ∨ e) ⋅ (d ∨ f ) (c ∨ e ∨ f ) ⋅ (b ∨ c) ⋅ (a ∨ c) ⋅ (d ∨ e) ⋅ (d ∨ f ) (c ∨ f ∨ e) ⋅ (a ∨ c) ⋅ (b ∨ c) ⋅ (d ∨ f ) ⋅ (d ∨ e) (c ∨ f ∨ e) ⋅ (b ∨ c) ⋅ (a ∨ c) ⋅ (d ∨ f ) ⋅ (d ∨ e)

4. Graph Automorphism

The problem of identifying all symmetries is known as the graph automorphism problem [27]. Given a graph, a symmetry, i.e., an automorphism, is a permutation of its vertices that maps edges to edges. In the case of directed graphs, edge orientations must be maintained. In terms of complexity, there are no known worst-case polynomial-time algorithms for solving the graph automorphism problem. Nevertheless, the problem is rarely difficult in practice and is not believed to be NP-complete nor to be in P. Algorithms typically finish in linear-time if the symmetry is trivial, and in polynomialtime for graphs of bounded vertex degree [28].

The complexity of solving the graph automorphism problem can be further simplified by the use of vertex labels. Each vertex is labeled by a color or an integer and the goal of the problem is reduced to mapping vertices to vertices of the same label.

In this paper, we will use the powerful graph automorphism tool, Saucy [29].

5. Detecting Symmetries in CNF Formulas

In this section, we describe how to identify symmetries in CNF formulas using graph automorphism [27]. The main idea is to express a CNF formula as an undirected graph such that the symmetry group of the graph is isomorphic to the symmetry group of the CNF formula. The first attempt to identify such symmetries by modeling CNF formulas as graphs was proposed in [30], and subsequently refined by Crawford et al. [31]. The construction proceeds as follows. Assuming a CNF formula with V variables and C clauses, of which C2 are binary clauses and Cx are clauses of size three or more (single literal clauses are removed by preprocessing the CNF formula), a graph is constructed as follows:

• A single vertex is created for each clause in Cx . • Two vertices are created for each variable, representing its positive and negative literals. • Edges are added connecting a clause vertex to its respective literal vertices.

Vertices corresponding to clauses, positive literals, and negative literals are colored differently. Boolean consistency is ensured by adding an extra vertex for each variable and two edges connecting the new vertex to the literals of that variable. The new set of vertices is assigned a new label. Clauses in C2 are represented by adding a single edge between the two literals of the clause. This construction yields a total of 3V + Cx vertices with four unique colors of vertices. An example of the construction is shown in Figure 2(a). Note that the presented construction cannot detect phase shift symmetries, i.e., symmetries of the form x → x .

Symmetry 2010, 2

1127

The construction in [31] was further improved in [32] to handle phase-shift symmetries and their compositions with permutational symmetries. The construction in [32] also produces a smaller graph compared to previous methods. Assuming a CNF formula with V variables and C clauses, of which C2 are binary clauses and Cx are clauses of size three or more (single literal clauses are removed by preprocessing the CNF formula), a graph is constructed as follows:

• A single vertex is created for each clause in Cx . • Two vertices are created for each variable, representing its positive and negative literals. • Edges are added connecting a clause vertex to its respective literal vertices.

Vertices corresponding to clauses and variables are colored differently. Boolean consistency is ensured by adding an edge between vertices of opposite literals. Clauses in C2 are represented by adding a single edge between the two literals of the clause. The construction yields a total of 2V + Cx vertices with two unique colors of vertices. An example of the construction is shown in Figure 2(b).

Figure 2. Conversion of the CNF formula (x ∨ y ∨ z) ⋅ (x ∨ y ∨ z) ⋅ ( y ∨ z) to a graph for symmetry extraction purposes using the constructions (a) in [31] and (b) in [32]. Different shapes correspond to different vertex colors.

6. Breaking Symmetries in CNF Formulas

Once the symmetries are identified, the next step is to break them in the CNF formula. This is accomplished by adding symmetry breaking predicates (SBPs) that choose lex-leaders (LL), i.e., lexicographically smallest assignments, in each orbit or equivalence class.

Crawford et al. [31] laid the theoretical foundation for constructing SBPs for CNF formulas that possess permutational symmetries. Their construction assumes a given variable ordering and consists of two nested conjunctions:

• An outer conjunction over all permutations in the group of symmetries. • An inner conjunction over all variables in the permutation.

This results in the selection of just the lex-leaders from each orbit and breaks all symmetries. Given a group of symmetries ∏ = {π1,…,π m} for a CNF formula defined over a set of totally-ordered variables x1 < x2 < … < xn , the LL-SBP is defined as follows:

Symmetry 2010, 2

1128

PP(π ) = ∩ [ ∩(x j = xπj )] → (xi ≤ xiπ )

(3)

1≤i ≤n i ≤ j ≤i −1

LL − SBP(∏) = ∩ PP(xi )

(4)

1≤i ≤m

We will refer to PP(π ) as the permutation predicate for permutation π . By introducing n “equality” variables, e j ≡ (x j = xπj ) , each PP can be translated to a CNF formula with 5n clauses and 0.5n2 + 13.5n literals. This is derived as follows. For each PP, we introduce variables e1,…, en . For each e j variable, we add four clauses, each consisting of three literals, in order to define the equality relationship with (x j = xπj ) . Furthermore, the outer conjunction over all n variables in (3) produces n

clauses of increasing sizes; the first, second, and n-th clause have a size of two, three, and n + 1 literals, respectively. Hence, all n clauses have a total of (n2 + 3n) / 2 literals. An example of this

construction is shown in Table 2.

Table 2. Example showing the creation of SBPs for the single-cycle permutation (abcd) using the formula in (3).

SBP

Clauses

This approach breaks all permutational symmetries and will be referred to as full symmetry breaking. The downside is that the number of symmetries in the symmetry group is usually exponential in the number of problem variables, making such full symmetry breaking impractical. This is addressed in [31] by building a symmetry tree and pruning it to remove unnecessary duplication. This, unfortunately, still does not preclude the need to consider an exponential number of permutations in order to break all symmetries.

For symmetry breaking to be effective in practice, the computational overhead of generating and manipulating the SBPs must be significantly less than the runtime savings they yield due to search pruning. Since breaking all symmetries can lead to an exponentially large SBP, which will significantly slow down the SAT solver, the authors in [33] suggested partial symmetry breaking. The idea is to break a few symmetries whose SBPs produce short CNF clauses. This would provide better runtime and memory trade-offs. The partial symmetry breaking ideas include creating permutation predicates for a subset, instead of all, generators. Irredundant generators are good candidates for

Symmetry 2010, 2

1129

symmetries to be broken because they cannot be expressed in terms of each other. Another idea is to create bit predicates for the first k, instead of all bits in a permutation. As an example, the following formula:

f (a,b, c, d ) = (a ∨ b ∨ c ∨ d ) ⋅ (a ∨ b ∨ c ∨ d ) ⋅ (a ∨ b ∨ c ∨ d ) ⋅ (a ∨ b ∨ c ∨ d ) (5)

has a total of 11 permutations. Two of the permutations represent the irredundant set of generators, namely: ϕ1 = (ab)(cd ) and ϕ2 = (abc) . Producing the SBPs for the two generators is more efficient than generating the SBPs for all 11 permutations, yet should still be able to lead to significant savings in SAT search runtime.

7. Experimental Results

In this section, we empirically show the advantage of breaking symmetries in CNF formulas. The experiments were performed on an Intel Xeon 3 Ghz machine with 4 GB of RAM running Linux. The runtime limit for all experiments was set to 1000 seconds. The benchmarks included the pigeon-hole (hole) [34], global routing (s3) [21], FPGA routing (fpga and chnl) [32], randomized Urquhart (urq) [35], and xor-chains (xor) [36]. We used the SAT solvers MiniSAT [13] and RSat [37] to test the advantage of SBPs. MiniSAT and RSat won the gold medals in the industrial category in the SAT 2005 and SAT 2007 competitions [38], respectively. Saucy [29] was used to detect the symmetries and only the irredundant generators were used to generate the SBPs. The tool Shatter [32] was used to generate the SBPs.

Table 3 shows the number of symmetries and generators identified in each instance. The numbers clearly show the significant savings obtained when generators are used to represent the complete set of symmetries. Furthermore, it is clear that all instances contain an exponential number of symmetries.

Table 3. Symmetry statistics of the tested instances.

Instance

S/U

chnl10_11 UNS

chnl10_12 UNS

chnl10_13 UNS

chnl11_12 UNS

chnl11_13 UNS

chnl11_14 UNS

fpga12_10_sat SAT

fpga12_11_sat SAT

fpga12_9_sat SAT

fpga13_10_sat SAT

fpga13_11_sat SAT

fpga13_12_sat SAT

hole10

UNS

hole11

UNS

# of Symmetries

4.19631E + 28 6.04269E + 30 1.02121E + 33 7.31165E + 32 1.23567E + 35 2.42191E + 37 5.41777E + 16 1.78786E + 18 5.41777E + 14 1.89622E + 17 1.2515E + 19 9.01083E + 20 1.4485E + 14 1.91202E + 16

# of Generators

39 41 43 43 45 47 28 29 25 28 30 32 19 21

Symmetry 2010, 2

1130

Table 3. cont.

hole12

UNS 2.98275E + 18

23

hole13

UNS 5.42861E + 20

25

hole8

UNS 14631321600

15

hole9

UNS 1.31682E + 12

17

s3-3-3-10

SAT 34828517376

28

s3-3-3-1

SAT 8707129344

26

s3-3-3-3

SAT 69657034752

29

s3-3-3-4

SAT 26121388032

27

s3-3-3-8

SAT 34828517376

28

Urq3_5

UNS 536870912

29

Urq4_5

UNS 8.79609E + 12

43

Urq5_5

UNS 4.72237E + 21

72

x1_16

UNS

131072

17

x1_24

UNS

16777216

24

x1_32

UNS 4294967296

32

x1_36

UNS 68719476736

36

TOTAL

2.43444E + 37

941

Table 4 compares the MiniSAT and RSat solvers’ runtimes for the original CNF instances and the instances augmented with SBPs. The table also shows the time needed by Saucy to detect all symmetries and the instance’s result as satisfiable (SAT) or unsatisfiable (UNS). Several observations are in order: (1) the symmetry detection runtime is negligible in most cases, (2) the symmetry detection runtime and SAT search runtime are not correlated, and (3) the addition of SBPs for the generators only significantly reduces the SAT search runtime in most cases. The only exception is some of the satisfiable global routing instances which are easily solved by the SAT solvers in their original case, nevertheless, the SAT solvers are also able to easily solve the instances with the augmented SBPs. Similar results using different solvers and machines were obtained in [32,33].

Table 4. SAT solver runtimes for the original instances and the instances augmented with SBPs. Symmetry detection runtimes are also provided. All runtimes are in seconds.

Instance

S/U

chnl10_11 UNS chnl10_12 UNS chnl10_13 UNS chnl11_12 UNS chnl11_13 UNS chnl11_14 UNS fpga12_10_sat SAT fpga12_11_sat SAT fpga12_9_sat SAT

Saucy Time (Sec)

0.04 0.05 0.07 0.07 0.08 0.09 0.04 0.05 0.03

RSAT Time (Sec)

Orig w/SBP Speedup

>1000 >1000 >1000 >1000 >1000 >1000 0.01 0.01 0.04

0.01 0.01 0.01 0.01 0.01 0.01 0.001 0.001 0.01

>100000 >100000 >100000 >100000 >100000 >100000

10 10 4

MiniSAT Time (Sec)

Orig w/SBP Speedup

67.92 115.26 108.97 520.44 >1000 >1000

0.01 >1000 0.01

0.01

6792

0.01 11526

0.01 10897

0.01 52044

0.01 >100000

0.01 >100000

0.01

1

0.01 >100000

0.01

1

OPEN ACCESS

symmetry

ISSN 2073-8994 www.mdpi.com/journal/symmetry

Symmetry in Boolean Satisfiability

Fadi A. Aloul

American University of Sharjah, P.O. Box 26666, Sharjah, United Arab Emirates; E-Mail: [email protected]

Received: 31 December 2009; in revised form: 14 May 2010 / Accepted: 9 June 2010 / Published: 11 June 2010

Abstract: This paper reviews recent approaches on how to accelerate Boolean Satisfiability (SAT) search by exploiting symmetries in the problem space. SAT search algorithms traverse an exponentially large search space looking for an assignment that satisfies a set of constraints. The presence of symmetries in the search space induces equivalence classes on the set of truth assignments. The goal is to use symmetries to avoid traversing all assignments by constraining the search to visit a few representative assignments in each equivalence class. This can lead to a significant reduction in search runtime without affecting the completeness of the search.

Keywords: boolean satisfiability; symmetries; search

1. Introduction

The last few years have seen a remarkable growth in the use of Boolean Satisfiability (SAT) models and algorithms for solving various problems in Engineering and Computer Science. This is mainly due to the fact that SAT algorithms have seen tremendous improvements in the last few years, allowing larger problem instances to be solved in different application domains. Such applications include formal verification [1], FPGA routing [2], power estimation [3], fault tolerance [4], network assignments [5], wireless communications [6], and scheduling [7]. SAT has also been extended to a variety of applications in Artificial Intelligence, including other well-known NP-complete problems.

SAT solvers have traditionally been used to solve decision problems. Given a set of Boolean variables and constraints expressed in products-of-sum form (also known as conjunctive normal form (CNF)), the goal is to identify a variable assignment that will satisfy all constraints in the problem or prove that no such assignment exists. It is well known that the SAT problem is NP-complete [8] and that any algorithmic approach for solving it will require at least worst-case exponential time in its size.

Symmetry 2010, 2

1122

Most powerful SAT solvers today are search based and use intelligent techniques to explore new regions of the search space while looking for a satisfying assignment. These intelligent techniques helped extend the application of SAT solvers to large problem instances consisting of thousands of variables and millions of constraints. Despite these advances, the tremendous growth in today’s designs is continuously outpacing the capabilities of existing SAT solvers. Many SAT instances remain hard to solve due to the significant size and complexity of the underlying systems they represent.

Prasad et al. observed that Boolean functions arising in Engineering applications are “structured” [9]. Unlike random problems, structured problems have non-uniform distributions and consist of clusters of variables. In this paper, we summarize the main findings in exploiting and analyzing the SAT instance’s structure to improve the search process. Specifically, we review an approach to accelerate SAT search by exploiting symmetries in the instance. The approach consists of an automated flow for (1) finding all syntactic symmetries of CNF formulas and (2) utilizing the symmetries to add symmetry-breaking predicates that effectively prune the search space and accelerate SAT solvers. Presented experimental results show the advantage of symmetry breaking for search algorithms.

The remainder of this paper is organized as follows. Section 2 presents an overview of Boolean Satisfiability. Section 3 describes basic symmetry definitions and notations. In Section 4, graph automorphism, which is used to identify symmetries, is described. Symmetry detection and breaking are discussed in Sections 5 and 6, respectively. Experimental results are presented in Section 7. Finally, the paper is concluded in Section 8.

2. Boolean Satisfiability

Recent years have seen significant advances in Boolean Satisfiability (SAT) solving. These advances have led to the successful deployment of SAT solvers in a wide range of problems in Engineering and Computer Science. The SAT problem involves finding an assignment to a set of binary variables that satisfies a given set of constraints or proving that no such assignment exists. In general, these constraints are expressed in products-of-sum form, also known as conjunctive normal form (CNF). SAT constraints will be referred to as clauses in the paper. A CNF formula, ϕ on n binary variables, x1,..., xn , consists of the conjunction (AND) of m clauses, ω1,...,ωm , each of which consists of the disjunction (OR) of literals. A literal is an occurrence of a Boolean variable or its complement.

As an example, the CNF instance:

f (a,b, c) = (a ∨ b) ⋅ (b ∨ c)

(1)

consists of three variables, two clauses, and four literals. The assignment {a = 1, b = 0, c = 0} leads to a conflict, whereas the assignment {a = 1, b = 0, c = 1} satisfies f. Note that a problem with n variables will have 2n possible assignments to test. The above example with three variables has eight possible assignments. An instance with 100 variables will have 1.27e + 30 assignments. Assuming a processor that can verify an assignment every one nanosecond, the processor will complete testing all 2100 assignments in 4e + 12 years.

Symmetry 2010, 2

1123

Despite the SAT problem being NP-Complete [8], there have been dramatic improvements in SAT solver technology over the past decade. This has led to the development of several powerful SAT algorithms that are capable of solving problems consisting of thousands of variables and millions of constraints. Such solvers include Grasp [10], zChaff [11], Berkmin [12], MiniSAT [13].

Most powerful SAT solvers are based on the original Davis-Putnam-Logemann-Loveland (DPLL) backtrack search algorithm [14]. The algorithm performs a depth first search process that traverses the space of 2n variable assignments until a satisfying assignment is found (the formula is satisfiable), or all combinations have been exhausted (the formula is unsatisfiable). The search process proceeds as follows. Originally, all variables are unassigned. The algorithm begins by choosing a decision assignment to an unassigned variable. A decision tree is maintained to keep track of variable assignments. An example of a decision tree is shown in Figure 1. After each decision, the algorithm determines the implications of the assignment on other variables. This is obtained by forcing the assignment of the variable representing an unassigned literal in an unresolved clause, whose all other literals are assigned to 0, to satisfy the clause. This is referred to as the unit clause rule. If no conflict is detected, the algorithm makes a new decision on a new unassigned variable. Otherwise, the backtracking process un-assigns one or more recently assigned variables and the search continues in another area of the search space.

Figure 1. An example of a satisfiable SAT instance showing its corresponding decision tree.

SAT solvers have been extended with several powerful algorithms to further expedite the search process. One of the best algorithms is known as the conflict analysis procedure [10] and has been implemented in almost all SAT solvers. Whenever a conflict is detected, the procedure identifies the causes of the conflict and augments the clause database with additional clauses, known as conflict-induced clauses, to avoid regenerating the same conflict in future parts of the search process. In essence, the procedure performs a form of learning from the encountered conflicts. Significant

Symmetry 2010, 2

1124

speedups have been achieved with the addition of conflict-induced clauses, as they tend to effectively prune the search space.

Intelligent decision heuristics and random restarts [11] also played an important role in enhancing the SAT solvers performance. The developers of the state-of-the-art SAT solver, zChaff [11], proposed an effective decision heuristic, known as VSIDS, and implemented several other enhancements, including random restarts, which lead to dramatic performance gains on many CNF instances.

Over the years, SAT has been directly applied to various electronic design automation (EDA) computational tasks. In general, the overhead of reducing many of these problems to SAT is small. Below, we briefly survey a number of EDA applications to which SAT has been successfully introduced.

One of the important applications of SAT is circuit verification. Due to the increasing complexity of modern hardware designs, verifying the correctness of these designs is becoming increasingly difficult. SAT has been successfully applied to formal verification, which has attracted much interest from industry, since unlike simulation, which only reveals the presence of bugs, formal verification can prove their absence. Specifically, combinational equivalence checking has been solved using SAT [15], and later, several attempts have been made to solve sequential equivalence checking problems using SAT [16]. Additionally, SAT has been used for bounded model checking [1], microprocessor verification [17], and functional vector generation [18]. Besides verification, SAT has also been heavily applied in automatic test pattern generation (ATPG) [19] and extended to delay fault testing [20]. SAT has also been used in FPGA routing [2], global routing [21], logic synthesis [22], cross talk noise analysis [23], power estimation [3], and power leakage estimation [24].

Additionally, SAT has been extended to a variety of applications in Artificial Intelligence, including other well known NP-complete problems such as graph colorability, vertex cover, hamiltonian path, and independent sets [25].

3. Symmetry Definitions and Notations

The symmetry of a discrete object is a permutation of its components that leaves the object intact. For example, the rotations of a spatial solid, e.g., a cylinder, that leave its shape unchanged or the negation of a Boolean variable in a Boolean formula, e.g., x in (x + y) ⋅ (x + y) ⋅ ( y + z) , that does not affect the formula or the function it represents.

Permutational symmetries can be classified as either syntactic or semantic symmetries. Given a Boolean function, a semantic symmetry is a permutation of variables that does not change the value of the function under any variable assignment. On the other hand, a syntactic symmetry, also known as a structural symmetry, is a permutation that does not change the representation of the function. Hence, a syntactic symmetry is also a semantic symmetry. In this paper, we focus on using syntactic symmetries to improve SAT search.

In this paper, we will use the cycle notation to express the permutations. For example, a permutation that swaps elements 2 and 3, and maps 4 to 5, 5 to 6, and 6 to 4, while mapping all other elements to themselves will be represented as ϕ = (23)(456) .

In general, the number of permutations in a permutation group can be exponentially large. Rather than explicitly representing the set of permutations in a group, they are typically represented implicitly

Symmetry 2010, 2

1125

using irredundant sets of generators. A set of generators is a set of group elements such that any other group element can be obtained from the products of the generators.

As an illustration, consider the permutations ϕ1 = (12) and ϕ2 = (23) expressed in cycle notation. Taking the composition, i.e., product, of ϕ1 ⋅ϕ2 generates the permutation ϕ3 = (123) , which is the result of performing ϕ2 followed by ϕ1.

A set of generators is irredundant if it is not possible to express any of its elements as a product of other elements. An interesting observation from group theory is that an irredundant set of generators for a group with N > 1 elements contains at most log2 N elements [26]. For example, the group of all permutations on k elements has k! permutations, but can be efficiently generated by only two generators: ϕ1 = (12) and ϕ2 = (12…k) . Thus, the use of irredundant sets of generators to express the complete set of permutations ensures exponential compression.

In this paper, we will assume a total ordering x1 < x2 < …< xn of the variables x1, x2 ,.…, xn and consider the induced lexicographic ordering of the 2n truth assignments, i.e., 0–1 strings of length n. Given an equivalence partition on these assignments, the lex-leader of a particular equivalence class, i.e., orbit, is defined as its lexicographically smallest element. Adding a lex-leader predicate ensures that a Boolean function evaluates to true only on lex-leaders of orbits.

For example, consider a Boolean function with four variables. Assume an orbit consists of the assignments {0100, 1000, 0010, 0001}. The lex-leader of the given orbit is {0001}. Adding a lexleader predicate will falsify the first three assignments.

In terms of a CNF formula, a permutation of literals is a symmetry of a given CNF formula if Boolean consistency is observed and the formula is preserved under the permutation. In other words, every clause must map into a clause with the same polarities of literals. The addition of a symmetrybreaking predicate (SBP), expressed as a set of CNF clauses, allows only one of many equivalent variable assignments to be a potential solution to the formula. An SBP is a full SBP if it selects exactly a single element from each orbit; otherwise it is known as a partial SBP. A lex-leader SBP (LL-SBP) is an SBP that selects lex-leaders only and is also classified as a full SBP. It is essential for partial SBPs, however, to select lex-leaders among other elements. Such SBPs are known as partial lex-leader SBPs (PLL-SBPs). Note that the addition of an SBP to the original CNF formula does not affect its satisfiability, but restricts the possible solutions to those selected by the SBP. In other words, if the original formula is satisfiable, the number of solutions may decrease considerably after pre-processing. However, if the original instance is unsatisfiable, the “number of equivalent paths leading nowhere” will be reduced and the SAT solver is likely to conclude faster that no solution exists.

As an example, assume the CNF formula:

f (a,b, c, d, e, f , g) = (c ∨ e ∨ f )(a ∨ c)(b ∨ c)(d ∨ e)(d ∨ f )

(2)

The group of all permutations of the CNF formula consists of four permutations {E, (ab), (dd )(ef ), (ab)(dd )(ef )} that can be efficiently obtained from the products of the two generators ϕ1 = (ab) and ϕ2 = (dd )(ef ) . Note that E is the identity symmetry. Permuting the literals of the formula according to all four permutations yields the same formula. This is shown in Table 1.

Symmetry 2010, 2

1126

Table 1. Permuting the literals of the CNF formula in (2) according to all its symmetries.

Permutation

E

(ab) (dd )(ef ) (ab)(dd )(ef )

CNF Formula (c ∨ e ∨ f ) ⋅ (a ∨ c) ⋅ (b ∨ c) ⋅ (d ∨ e) ⋅ (d ∨ f ) (c ∨ e ∨ f ) ⋅ (b ∨ c) ⋅ (a ∨ c) ⋅ (d ∨ e) ⋅ (d ∨ f ) (c ∨ f ∨ e) ⋅ (a ∨ c) ⋅ (b ∨ c) ⋅ (d ∨ f ) ⋅ (d ∨ e) (c ∨ f ∨ e) ⋅ (b ∨ c) ⋅ (a ∨ c) ⋅ (d ∨ f ) ⋅ (d ∨ e)

4. Graph Automorphism

The problem of identifying all symmetries is known as the graph automorphism problem [27]. Given a graph, a symmetry, i.e., an automorphism, is a permutation of its vertices that maps edges to edges. In the case of directed graphs, edge orientations must be maintained. In terms of complexity, there are no known worst-case polynomial-time algorithms for solving the graph automorphism problem. Nevertheless, the problem is rarely difficult in practice and is not believed to be NP-complete nor to be in P. Algorithms typically finish in linear-time if the symmetry is trivial, and in polynomialtime for graphs of bounded vertex degree [28].

The complexity of solving the graph automorphism problem can be further simplified by the use of vertex labels. Each vertex is labeled by a color or an integer and the goal of the problem is reduced to mapping vertices to vertices of the same label.

In this paper, we will use the powerful graph automorphism tool, Saucy [29].

5. Detecting Symmetries in CNF Formulas

In this section, we describe how to identify symmetries in CNF formulas using graph automorphism [27]. The main idea is to express a CNF formula as an undirected graph such that the symmetry group of the graph is isomorphic to the symmetry group of the CNF formula. The first attempt to identify such symmetries by modeling CNF formulas as graphs was proposed in [30], and subsequently refined by Crawford et al. [31]. The construction proceeds as follows. Assuming a CNF formula with V variables and C clauses, of which C2 are binary clauses and Cx are clauses of size three or more (single literal clauses are removed by preprocessing the CNF formula), a graph is constructed as follows:

• A single vertex is created for each clause in Cx . • Two vertices are created for each variable, representing its positive and negative literals. • Edges are added connecting a clause vertex to its respective literal vertices.

Vertices corresponding to clauses, positive literals, and negative literals are colored differently. Boolean consistency is ensured by adding an extra vertex for each variable and two edges connecting the new vertex to the literals of that variable. The new set of vertices is assigned a new label. Clauses in C2 are represented by adding a single edge between the two literals of the clause. This construction yields a total of 3V + Cx vertices with four unique colors of vertices. An example of the construction is shown in Figure 2(a). Note that the presented construction cannot detect phase shift symmetries, i.e., symmetries of the form x → x .

Symmetry 2010, 2

1127

The construction in [31] was further improved in [32] to handle phase-shift symmetries and their compositions with permutational symmetries. The construction in [32] also produces a smaller graph compared to previous methods. Assuming a CNF formula with V variables and C clauses, of which C2 are binary clauses and Cx are clauses of size three or more (single literal clauses are removed by preprocessing the CNF formula), a graph is constructed as follows:

• A single vertex is created for each clause in Cx . • Two vertices are created for each variable, representing its positive and negative literals. • Edges are added connecting a clause vertex to its respective literal vertices.

Vertices corresponding to clauses and variables are colored differently. Boolean consistency is ensured by adding an edge between vertices of opposite literals. Clauses in C2 are represented by adding a single edge between the two literals of the clause. The construction yields a total of 2V + Cx vertices with two unique colors of vertices. An example of the construction is shown in Figure 2(b).

Figure 2. Conversion of the CNF formula (x ∨ y ∨ z) ⋅ (x ∨ y ∨ z) ⋅ ( y ∨ z) to a graph for symmetry extraction purposes using the constructions (a) in [31] and (b) in [32]. Different shapes correspond to different vertex colors.

6. Breaking Symmetries in CNF Formulas

Once the symmetries are identified, the next step is to break them in the CNF formula. This is accomplished by adding symmetry breaking predicates (SBPs) that choose lex-leaders (LL), i.e., lexicographically smallest assignments, in each orbit or equivalence class.

Crawford et al. [31] laid the theoretical foundation for constructing SBPs for CNF formulas that possess permutational symmetries. Their construction assumes a given variable ordering and consists of two nested conjunctions:

• An outer conjunction over all permutations in the group of symmetries. • An inner conjunction over all variables in the permutation.

This results in the selection of just the lex-leaders from each orbit and breaks all symmetries. Given a group of symmetries ∏ = {π1,…,π m} for a CNF formula defined over a set of totally-ordered variables x1 < x2 < … < xn , the LL-SBP is defined as follows:

Symmetry 2010, 2

1128

PP(π ) = ∩ [ ∩(x j = xπj )] → (xi ≤ xiπ )

(3)

1≤i ≤n i ≤ j ≤i −1

LL − SBP(∏) = ∩ PP(xi )

(4)

1≤i ≤m

We will refer to PP(π ) as the permutation predicate for permutation π . By introducing n “equality” variables, e j ≡ (x j = xπj ) , each PP can be translated to a CNF formula with 5n clauses and 0.5n2 + 13.5n literals. This is derived as follows. For each PP, we introduce variables e1,…, en . For each e j variable, we add four clauses, each consisting of three literals, in order to define the equality relationship with (x j = xπj ) . Furthermore, the outer conjunction over all n variables in (3) produces n

clauses of increasing sizes; the first, second, and n-th clause have a size of two, three, and n + 1 literals, respectively. Hence, all n clauses have a total of (n2 + 3n) / 2 literals. An example of this

construction is shown in Table 2.

Table 2. Example showing the creation of SBPs for the single-cycle permutation (abcd) using the formula in (3).

SBP

Clauses

This approach breaks all permutational symmetries and will be referred to as full symmetry breaking. The downside is that the number of symmetries in the symmetry group is usually exponential in the number of problem variables, making such full symmetry breaking impractical. This is addressed in [31] by building a symmetry tree and pruning it to remove unnecessary duplication. This, unfortunately, still does not preclude the need to consider an exponential number of permutations in order to break all symmetries.

For symmetry breaking to be effective in practice, the computational overhead of generating and manipulating the SBPs must be significantly less than the runtime savings they yield due to search pruning. Since breaking all symmetries can lead to an exponentially large SBP, which will significantly slow down the SAT solver, the authors in [33] suggested partial symmetry breaking. The idea is to break a few symmetries whose SBPs produce short CNF clauses. This would provide better runtime and memory trade-offs. The partial symmetry breaking ideas include creating permutation predicates for a subset, instead of all, generators. Irredundant generators are good candidates for

Symmetry 2010, 2

1129

symmetries to be broken because they cannot be expressed in terms of each other. Another idea is to create bit predicates for the first k, instead of all bits in a permutation. As an example, the following formula:

f (a,b, c, d ) = (a ∨ b ∨ c ∨ d ) ⋅ (a ∨ b ∨ c ∨ d ) ⋅ (a ∨ b ∨ c ∨ d ) ⋅ (a ∨ b ∨ c ∨ d ) (5)

has a total of 11 permutations. Two of the permutations represent the irredundant set of generators, namely: ϕ1 = (ab)(cd ) and ϕ2 = (abc) . Producing the SBPs for the two generators is more efficient than generating the SBPs for all 11 permutations, yet should still be able to lead to significant savings in SAT search runtime.

7. Experimental Results

In this section, we empirically show the advantage of breaking symmetries in CNF formulas. The experiments were performed on an Intel Xeon 3 Ghz machine with 4 GB of RAM running Linux. The runtime limit for all experiments was set to 1000 seconds. The benchmarks included the pigeon-hole (hole) [34], global routing (s3) [21], FPGA routing (fpga and chnl) [32], randomized Urquhart (urq) [35], and xor-chains (xor) [36]. We used the SAT solvers MiniSAT [13] and RSat [37] to test the advantage of SBPs. MiniSAT and RSat won the gold medals in the industrial category in the SAT 2005 and SAT 2007 competitions [38], respectively. Saucy [29] was used to detect the symmetries and only the irredundant generators were used to generate the SBPs. The tool Shatter [32] was used to generate the SBPs.

Table 3 shows the number of symmetries and generators identified in each instance. The numbers clearly show the significant savings obtained when generators are used to represent the complete set of symmetries. Furthermore, it is clear that all instances contain an exponential number of symmetries.

Table 3. Symmetry statistics of the tested instances.

Instance

S/U

chnl10_11 UNS

chnl10_12 UNS

chnl10_13 UNS

chnl11_12 UNS

chnl11_13 UNS

chnl11_14 UNS

fpga12_10_sat SAT

fpga12_11_sat SAT

fpga12_9_sat SAT

fpga13_10_sat SAT

fpga13_11_sat SAT

fpga13_12_sat SAT

hole10

UNS

hole11

UNS

# of Symmetries

4.19631E + 28 6.04269E + 30 1.02121E + 33 7.31165E + 32 1.23567E + 35 2.42191E + 37 5.41777E + 16 1.78786E + 18 5.41777E + 14 1.89622E + 17 1.2515E + 19 9.01083E + 20 1.4485E + 14 1.91202E + 16

# of Generators

39 41 43 43 45 47 28 29 25 28 30 32 19 21

Symmetry 2010, 2

1130

Table 3. cont.

hole12

UNS 2.98275E + 18

23

hole13

UNS 5.42861E + 20

25

hole8

UNS 14631321600

15

hole9

UNS 1.31682E + 12

17

s3-3-3-10

SAT 34828517376

28

s3-3-3-1

SAT 8707129344

26

s3-3-3-3

SAT 69657034752

29

s3-3-3-4

SAT 26121388032

27

s3-3-3-8

SAT 34828517376

28

Urq3_5

UNS 536870912

29

Urq4_5

UNS 8.79609E + 12

43

Urq5_5

UNS 4.72237E + 21

72

x1_16

UNS

131072

17

x1_24

UNS

16777216

24

x1_32

UNS 4294967296

32

x1_36

UNS 68719476736

36

TOTAL

2.43444E + 37

941

Table 4 compares the MiniSAT and RSat solvers’ runtimes for the original CNF instances and the instances augmented with SBPs. The table also shows the time needed by Saucy to detect all symmetries and the instance’s result as satisfiable (SAT) or unsatisfiable (UNS). Several observations are in order: (1) the symmetry detection runtime is negligible in most cases, (2) the symmetry detection runtime and SAT search runtime are not correlated, and (3) the addition of SBPs for the generators only significantly reduces the SAT search runtime in most cases. The only exception is some of the satisfiable global routing instances which are easily solved by the SAT solvers in their original case, nevertheless, the SAT solvers are also able to easily solve the instances with the augmented SBPs. Similar results using different solvers and machines were obtained in [32,33].

Table 4. SAT solver runtimes for the original instances and the instances augmented with SBPs. Symmetry detection runtimes are also provided. All runtimes are in seconds.

Instance

S/U

chnl10_11 UNS chnl10_12 UNS chnl10_13 UNS chnl11_12 UNS chnl11_13 UNS chnl11_14 UNS fpga12_10_sat SAT fpga12_11_sat SAT fpga12_9_sat SAT

Saucy Time (Sec)

0.04 0.05 0.07 0.07 0.08 0.09 0.04 0.05 0.03

RSAT Time (Sec)

Orig w/SBP Speedup

>1000 >1000 >1000 >1000 >1000 >1000 0.01 0.01 0.04

0.01 0.01 0.01 0.01 0.01 0.01 0.001 0.001 0.01

>100000 >100000 >100000 >100000 >100000 >100000

10 10 4

MiniSAT Time (Sec)

Orig w/SBP Speedup

67.92 115.26 108.97 520.44 >1000 >1000

0.01 >1000 0.01

0.01

6792

0.01 11526

0.01 10897

0.01 52044

0.01 >100000

0.01 >100000

0.01

1

0.01 >100000

0.01

1