Predicting Intermediate and Multiple Conclusions
on Predicate-Logic Reasoning Problems:
Further Investigation of a Theory of Mental Logic
by
Jasminka Grgas©
Baruch College of the City University of New York
Running head: MENTAL PREDICATE LOGIC
Submitted to the Committee on Undergraduate Honors of Baruch College
of The City University of New York in partial fulfillment of the requirements
for the degree of Bachelor of Arts in Psychology with Honors.
Acknowledgement
Introduction
Experiment 1
Experiment 2
Contents of the "Other"
Responses in Experiments 1 and 2
General Discussion
References
Table 1
Table 2

I would like to thank my advisor, Professor David O'Brien, for his support
and all his help with this project. I am especially grateful for his guidance
in the field of cognitive psychology. I would also like to thank Professor
Joseph Hosie for emphasizing the importance of research in psychology,
and for his numerous suggestions.
Jasminka
The mental-logic theory (ML theory) proposed by Braine and O'Brien
(e.g., 1991, 1998) consists of two parallel models: a mental propositional
logic and its extension to a mental predicate logic (Braine & O'Brien,
1998). The mental propositional logic addresses inferences that can
be drawn on the basis of logic particles such as those expressed with
the English-language words if, and, or, and not. The mental predicate
logic provides further analyses of the internal composition of propositions,
including predicate/argument structure as well as quantifiers (e.g.,
all, some, none) and a way of representing their scope.
The research reported here was designed to provide additional empirical
support for the mental predicate logic. The logic inferences investigated
are claimed to be made both in reasoning and in discourse processing,
and since they are made routinely and easily, especially in discourse
processing, people often do not recognize that they are making any inferences
at all. The logic inferences are based on the meanings of English-language
particles and quantifiers such as if, and, or, not, all, some. ML theory
proposes that the meanings of these particles and quantifiers are given
by the inferences that they sanction.
The theory consists of a core and a pragmatic part. The core part includes
a set of inference schemas and a reasoning program that applies the
schemas in lines of reasoning. The ML inference schemas differ from
the sorts of schemas that are found in standard logic books in several
ways, e.g., they allow concatenation of more than two constituents,
but for simplicity of presentation the schemas are described here in
a simpler form. In addition, in standard logic anything follows from
contradictory premises, whereas in mental logic nothing would follow
from contradictory premises, except a judgment that something is wrong.
The pragmatic (noncore) part of the theory is concerned with pragmatic
principles that are involved in premise interpretation and in making inferences
that go beyond those made by the ML inference schemas (e.g., invited
pragmatic schemas). This part is not relevant to the experiments reported
here, and it will not be discussed further.
The theory makes a distinction between the following types of schemas:
core schemas, feeder schemas, incompatibility schemas, and some others.
People are usually more aware of the output of the core schemas than of
that of the feeder schemas, and they apply the core schemas more freely.
The core schemas are applied when premises of the requisite form are active
in working memory and the premises are considered true (can be treated as assumptions).
The feeder schemas are applied when their output satisfies the conditions
of application of a core schema.
The partial list below presents those schemas that are involved in the
investigation reported here. For each schema the propositional-level
version is given in the first row, followed by the corresponding
predicate-logic version(s). The notation is illustrated and explained following
the first three schemas.
Core Schemas:
| (1) | p OR q; ~p / q |
| | S1[All X] OR S2[PRO-All X]; NEG S2[α]; [α] ⊆ [X] / S1[α] |
| | S1[All X] OR S2[PRO-All X] / S2[All X: NEG S1[PRO]] |
Schema 1 is a disjunction-elimination schema: When one of two alternatives
is false, the other must be true. The first of the predicate-logic versions
can be rendered in English as "All of the Xs satisfy predicate
S1 or they satisfy S2; some particular object or set of objects, α,
does not satisfy S2; α is included among the Xs; one can conclude that
α satisfies S1." (The "PRO" notation usually is realized as a pronoun.
This way of treating quantificational scope differs from standard logic
and is closer to the structures of natural languages. For discussion
of the notational system, see Braine, in press.) The second predicate-logic
version can be rendered as "All of the Xs satisfy predicate S1
or they satisfy S2; one can conclude that all of the Xs such that they
do not satisfy S1 satisfy S2." An example of a problem of the
sort discussed later that uses the first predicate-logic version of
this schema (referring to beads of various colors, shapes, sizes,
etc.) presents the premises "All of the beads are green or they are small"
and "The round beads are not small"; application of the schema
leads to the inference that the round beads are green.
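The bead example can be sketched in Python as an illustration. This is only a rough sketch under stated assumptions: the class names `AllOr`, `Neg`, and `Included` and the function `schema1_predicate` are hypothetical, plain strings stand in for sets of objects and predicates, and handling a negated first disjunct symmetrically is an assumption that goes beyond the predicate-logic version as written.

```python
from typing import NamedTuple, Optional, Tuple


class AllOr(NamedTuple):
    """S1[All X] OR S2[PRO-All X]: every X satisfies s1 or satisfies s2."""
    x: str
    s1: str
    s2: str


class Neg(NamedTuple):
    """NEG S[alpha]: the objects named alpha do not satisfy s."""
    alpha: str
    s: str


class Included(NamedTuple):
    """[alpha] is included among [X]."""
    alpha: str
    x: str


def schema1_predicate(disj: AllOr, neg: Neg, incl: Included) -> Optional[Tuple[str, str]]:
    """Return (alpha, predicate) when Schema 1 applies, otherwise None."""
    # The negation and the inclusion must concern the same objects,
    # and those objects must be included in the quantified set X.
    if neg.alpha != incl.alpha or incl.x != disj.x:
        return None
    if neg.s == disj.s2:
        return (neg.alpha, disj.s1)
    # Assumption: the mirror-image case (negated S1) is treated the same way.
    if neg.s == disj.s1:
        return (neg.alpha, disj.s2)
    return None


conclusion = schema1_predicate(
    AllOr("beads", "green", "small"),
    Neg("round beads", "small"),
    Included("round beads", "beads"),
)
print(conclusion)
```

With the bead premises, the function returns the pair standing for "the round beads are green"; when the negated predicate matches neither disjunct, no inference is made.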
| (2) | If p THEN q; p / q |
| | S[All X]; [α] ⊆ [X] / S[α] |
| | NEG S[~Some X~]; [α] ⊆ [X] / NEG S[α] |
At the propositional level Schema 2 is standard logic's modus ponens.
The first of its predicate-logic versions can be rendered as "All
of the Xs satisfy S; some particular object or set of objects, α,
is among the Xs; it can be concluded that α satisfies S." The second
can be rendered as "There is no X that satisfies S; some particular
object, α, is included among the Xs;
it can be concluded that α does not satisfy S." (The tildes around
"Some X" indicate that it is within the scope of the negation
and can be instantiated. "NEG S[Some X]" would indicate that
"some X is not S." One could not then conclude that α is not
an X. Note that the meaning of the quantifier is given by the inferences
about instantiation, i.e., which objects can or cannot satisfy the predicate.)
An example of a problem that uses the second predicate-logic version
of the schema (referring to some children in a school) has as premises
"None of the children wearing red shirts are playing basketball"
and "All the boys are wearing red shirts," which lead to the conclusion
that the boys are not playing basketball.
| (3) | ~(p & q); p / ~q |
| | NEG E[~Some X: S1[PRO-ALL X] & S2[PRO]~]; S2[α]; [α] ⊆ [X] / NEG S1[α] |
| | NEG(S1[All X] & S2[PRO-All X]) / NEG S2[All X: S1[PRO]] |
Schema 3 concerns negative-conjunction elimination. The first predicate-logic
version can be rendered "There is not some X such that it satisfies
S1 and satisfies S2; some particular object, α,
satisfies S2; α is included among the
Xs; one can conclude that α does not
satisfy S1." The second predicate-logic version can be rendered "Not
all of the Xs satisfy both S1 and S2; one can conclude that the Xs that
satisfy S1 do not satisfy S2." An example of a problem that uses
the propositional-level version of this schema (referring to a box containing
toy animals) has as premises "It is false that there is both a camel
and a monkey in the box" and "There is a camel"; one can infer that
there is not a monkey in the box.
| (4) | p OR q; If p THEN r; If q THEN r / r |
| | S1[All X] OR S2[PRO-All X]; S3[All X: S1[PRO]]; S3[All X: S2[PRO]] / S3[All X] |
| (5) | p OR q; If p THEN r; If q THEN s / r OR s |
| | S1[All X] OR S2[PRO-All X]; S3[All X: S1[PRO]]; S4[All X: S2[PRO]] / S3[All X] OR S4[PRO-All X] |
Principal Feeder Schemas:
| (6) | p; q / p & q |
| | S1[All X]; S2[All X] / S1[All X] & S2[PRO-All X] |
| (7) | p & q / p |
| | S1[Q X] & S2[PRO-Q X] / S2[Q X] |
| | (Q refers to any quantifier, e.g., all, some, many, few.) |
Incompatibility Schemas:
| (8) | p; ~p / incompatible |
| | S[All X]; NEG S[Q X] / incompatible |
| | S[Q X]; NEG S[All X] / incompatible |
| (9) | p OR q; ~p & ~q / incompatible |
| | S1[All X] OR S2[All X]; NEG S1[Q X] & NEG S2[PRO-Q X] / incompatible |
| | S1[Q X] OR S2[PRO-Q X]; NEG S1[All X] & NEG S2[All X] / incompatible |
The reasoning program that implements the inference schemas includes
a direct reasoning routine (DRR) and some indirect reasoning strategies
that go beyond the DRR. The theory predicts that inferences made through
application of the DRR are essentially available to everyone and are
made routinely and effortlessly. The DRR is considered to be the first
facility that is used in logical reasoning and it consists of three
parts. A preliminary procedure determines if there is a conclusion to
be evaluated. If there is a tentative conclusion of the form if-then,
the preliminary procedure adds its antecedent to the premise set and
treats its consequent as a conclusion to be evaluated. An evaluation
procedure leads to either a "true" or "false" response.
A "true" response results from the conclusion being in the premise
set or being inferred from the premise set by application of one or a combination
of the schemas. A "false" response is made when a proposition is
reached that is incompatible, by Schema 8 or 9, with a premise or with
a proposition that has been inferred. An inference procedure applies
any core schema whenever its conditions have been met, i.e., whenever
its requisite propositions are considered conjointly in working memory;
the feeder schemas are applied only when their output would provide
for the conditions of a core schema to be met (or in a possible one-time
application to feed a conclusion). Finally, when a topic set is present
(either because of some strategic consideration or because it has been
provided), any core schema that makes an inference about that topic
is applied. Neither the schemas nor the reasoning program provide any
means for making indeterminacy judgments, i.e., that the truth or falsity
of some conclusion is uncertain given a set of premises, and the schemas
involved in making incompatibility judgments are not sufficient for
judging the consistency of large or complex premise sets.
Unlike the DRR, the indirect reasoning strategies are not claimed to
be universally available and their application requires effort (although
Braine, Reiser, and Rumain, 1984, reported that some strategies are
available to many college students, and are presumed to be available
in other populations). Consequently, ML theory predicts that inferences
requiring any of the indirect-reasoning strategies would be made far
less often than those that follow from DRR. The indirect-reasoning strategies
are not described here because they are not required on any of the problems
reported here.
Several sorts of evidence have been reported in support of
ML theory, although most of the investigations have addressed only the
propositional level of the theory: The theory has predicted successfully
which reasoning problems people solve, the perceived relative difficulties
of those problems, the order in which intermediate inferences are made
in lines of reasoning, and which logical inferences are made routinely and
effortlessly in text comprehension; studies have also established that those
inferences are made on line as the information enters working memory.
The data reported by Braine et al. (1984) clearly support the most
basic prediction of ML theory, i.e., that inferences that follow from
application of the DRR will be made routinely, and those requiring reasoning
resources beyond the DRR will be made far less often. Participants were
presented with two types of problems: Fifty-four problems were solvable
by application of the propositional schemas and the DRR, and another
19 problems required reasoning strategies that go beyond the DRR. Each
problem presented a set of premises together with a conclusion to be
evaluated as true or false. To minimize potential content interference
with solution, all problems referred to letters written on an imaginary
blackboard (e.g., "If there is an F on the blackboard, there is
a W."). Errors were not significantly associated with problem length,
and as was expected, almost no errors were made on the direct-reasoning
problems. On the problems that required more sophisticated reasoning
strategies, however, errors often were made. Subsequent investigations
(e.g., Braine, O'Brien, Noveck, Samuels, Fisch, Lea and Yang, 1995;
O'Brien, Braine, and Yang, 1994) provided further evidence for ML theory.
In these investigations participants were able to make the predicted
inferences both when the problems were presented with conclusions to
be evaluated and when only premises were presented and participants were
asked to write down everything that follows, without any conclusion to be
evaluated. Again, as predicted, very few errors were made on direct-reasoning
problems.
Braine et al. (1984) provided an additional sort of evidence to support
the claim that not only were their direct-reasoning problems being solved,
but that they were being solved in the way described by the DRR. The
participants were directed to rate the perceived relative difficulty
of each problem on a Likert-type scale, and Braine et al. constructed
a regression model from the perceived-difficulty rating data that assigned
a weight to each schema. This enabled prediction of the difficulty of
each problem (as being equal to the sum of the weights of each schema
required for problem solution as predicted by the DRR). For example,
a problem with premises of the form p or q, If q then r, not both
r and s, and not p, and requiring evaluation of not s would
lead first to the application of Schema 1 to the first and last of the
premises, which yields q, then to application of Schema 2, which
yields r, and finally to application of Schema 3, which yields
not s; the predicted difficulty of this problem is the sum of
the difficulty weights for Schemas 1, 2, and 3. Correlations between
predicted and observed difficulties accounted for 66% of the variance
(53% with problem length partialed out), even when the weights were
obtained with one set of problems and the observed ratings were obtained
with another set of problems and different participants. Yang, Braine,
and O'Brien (1998) conducted a similar investigation of direct-reasoning
predicate-logic problems. Again, almost no errors were made in assessing
the conclusions and again the ratings predicted by the schema weights
correlated highly with the observed rating (69% of the variance; 56%
when problem length was partialed out), even when observed ratings came
from new problems and different participants than those used to generate
the schema weights.
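The additive difficulty prediction described above can be illustrated with a short Python sketch. The weights below are hypothetical placeholders chosen only for illustration; the values actually estimated by Braine et al. (1984) are not reported in this paper, and the names `SCHEMA_WEIGHTS` and `predicted_difficulty` are likewise assumptions.

```python
# Hypothetical difficulty weights, one per schema. These are NOT the values
# estimated by Braine et al. (1984); they only illustrate the additive model.
SCHEMA_WEIGHTS = {1: 1.0, 2: 0.5, 3: 1.5}


def predicted_difficulty(schemas_used, weights=SCHEMA_WEIGHTS):
    """Sum the weights of the schemas applied along the predicted line of reasoning."""
    return sum(weights[schema] for schema in schemas_used)


# The example problem: Schema 1 yields q, Schema 2 yields r, Schema 3 yields not s.
example = predicted_difficulty([1, 2, 3])
print(example)
```

A schema that is applied twice in a line of reasoning contributes its weight twice, since the prediction sums over every schema application that the DRR requires.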
The sort of evidence provided by Braine et al. (1984) and Yang et al.
(1998) is supportive of the mental-logic account, but only indirectly
addresses whether participants were constructing the predicted lines
of reasoning. A more direct sort of evidence has been reported for the
propositional-level schemas by Braine et al. (1995) and O'Brien et al.
(1994). In these studies, participants were asked to write down every
step in their reasoning process, i.e., to write things down in the order
that they figured things out. Some problems presented conclusions to
be evaluated and participants were asked to write down everything they
figured out on the way to their final judgment; other problems presented
only premises and on these problems participants were asked to write
down everything they could figure out from the premises in the order
that they figured things out.
Consider a problem presented in O'Brien et al. (1994), referring to
letters written on an imaginary blackboard, with premises of the form
N or P; not N; if P then H; if H then Z; and not both Z and
S. The DRR applies Schema 1 to the first two premises to infer P,
which then leads with the third premise to application of Schema 2 to
infer H, which then leads with the fourth premise to application of
Schema 2 to infer Z, and, finally, with the fifth premise, to application
of Schema 3 to infer not S. Now consider another problem from O'Brien
et al. with the same premises presented in the reverse order: not
both Z and S; if H then Z; if P then H; not N; and N or P.
The DRR is unable to apply any of the core schemas until all of the
premises have been read; it then applies Schema 1 to premises 4 and 5 to infer
P, then Schema 2 to infer H, then Schema 2 to infer Z, and then
Schema 3 to infer not S. Note that the DRR leads to the same sequence
of intermediate inferences and to the same final conclusion on both
problems. (A reasoner might use strategies that go beyond the DRR on
the latter problem, for example first inferring If H then not S,
but this does not lead to any additional inferences, and O'Brien et
al. found that the only commonly made inferences were those predicted
by the DRR.) The order of predicted inferences is determined by the
order in which the core schemas become available (not by the order in
which the premises are presented), and O'Brien et al. found that the
order in which participants wrote down inferences on both problems corresponded
to that predicted by the DRR.
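The order-independence of the predicted inferences can be illustrated with a small forward-chaining sketch in Python. This is not the authors' reasoning program: the tuple encoding of premises and the names `apply_core_schemas` and `drr` are assumptions made for illustration. The sketch covers only the propositional versions of Schemas 1 to 3 as numbered in the list above, and it omits feeder schemas, incompatibility schemas, and the evaluation procedure.

```python
def apply_core_schemas(known):
    """Return (schema number, conclusion) for the first new inference, or None."""
    for prop in known:
        if not isinstance(prop, tuple):
            continue  # bare atoms such as "P" license nothing by themselves
        if prop[0] == "or":  # Schema 1: p or q; not p / q
            _, a, b = prop
            if ("not", a) in known and b not in known:
                return (1, b)
            if ("not", b) in known and a not in known:
                return (1, a)
        elif prop[0] == "if":  # Schema 2: if p then q; p / q
            _, antecedent, consequent = prop
            if antecedent in known and consequent not in known:
                return (2, consequent)
        elif prop[0] == "notboth":  # Schema 3: not both p and q; p / not q
            _, a, b = prop
            if a in known and ("not", b) not in known:
                return (3, ("not", b))
            if b in known and ("not", a) not in known:
                return (3, ("not", a))
    return None


def drr(premises):
    """Read premises one at a time, applying core schemas as soon as they apply."""
    known, trace = [], []
    for premise in premises:
        known.append(premise)
        while (step := apply_core_schemas(known)) is not None:
            known.append(step[1])
            trace.append(step)
    return trace


forward = [("or", "N", "P"), ("not", "N"), ("if", "P", "H"),
           ("if", "H", "Z"), ("notboth", "Z", "S")]
reverse = list(reversed(forward))

print(drr(forward))
print(drr(reverse))
```

In both premise orders the trace lists P, H, Z, and not S, in that order, because each inference is made only once the propositions its schema requires are all present.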
Several investigations have provided evidence for the mental-logic
inferences in text processing (e.g., Lea, O'Brien, Fisch, Noveck, &
Braine, 1990; Lea, 1995), finding that the core inferences are made
routinely when their premises are embedded within short story vignettes;
further, these inferences are made so easily that people usually do
not realize that any inferences are being made at all. Unlike other
sorts of inferences made while reading, e.g., inferences from story
grammars, scripts, etc., which are made only when they are bridging
inferences, i.e., required to maintain textual coherence, the mental-logic
inferences are made so long as their requisite premises are held conjointly
in working memory.
There is, thus, an abundance of evidence in favor of the predictions
of ML theory, but to date only the Yang et al. (1998) studies described
earlier have assessed the predicate-logic schemas, and those studies
provided only indirect evidence that the predicate-logic reasoning problems
were being solved using the lines of reasoning predicted by the DRR
and the mental predicate-logic schemas. The motivation for the experiments
reported here was to provide some direct evidence for the lines of reasoning
predicted for such problems. The basic strategy was adopted from Braine
et al. (1995) and O'Brien et al. (1994). As described earlier, those
studies presented premises and required participants to write down everything
that could be figured out from the premises in the order in which things
were figured out. The problems presented here similarly required that
each step in the reasoning processes be written down.
For the predicate-logic problems presented here, participants were
asked to write down everything about the topic set that could be figured
out from the premises. Table 1 shows the lines of inference that are
predicted by the DRR for Problem Set 1. These problems were designed
to be maximally simple, in that the inferences predicted by the DRR
could be applied as each premise was read, i.e., the problems were constructed
so that inferences could be made in the same order as the premises were
presented. Problem Set 2 was identical, except that the order in which
the premises were presented was random. (Table 1 shows the order in
which the premises of Problem Set 2 were presented.) It was predicted
that the line of inferences written down on these problems would not
differ from those predicted for Problem Set 1. This prediction follows
from the principle that the order of inferences will be governed by
the availability of inference schemas rather than by the order in which
premises are encountered.
Experiment 2 replicated the problem forms presented in Experiment 1;
the problem content differed, however, between the two experiments.
Whereas the problems in Experiment 1 concerned beads of various shapes,
sizes, patterns, etc., the problems in Experiment 2 concerned the actions
and attributes of various groups of children. It was predicted that
the lines of inferences would not be altered by the change of problem
content.

Experiment 1
Method
Participants.
Fifty undergraduate students who were enrolled in an introductory psychology
course at Baruch College participated to fulfill a course requirement.
Twenty-six of the participants received Problem Set 1, and 24 received
Problem Set 2. Eleven of the participants either did not follow instructions
or failed to respond to every problem, and data from these are not included
in the reported results, leaving data from 21 participants for Problem
Set 1 and 18 for Problem Set 2.
Tasks and Procedures.
Twenty predicate-logic reasoning problems were constructed to constitute
Problem Set 1. The problems were constructed so that the predicate-logic
schemas could be applied as the premises were read. For example, Problem
1 (see Table 1) allows application of Schema 3 as soon as the first
two premises are read. This allows Schema 2 to be applied when the third
premise is read, and then Schema 3 to be applied as the fourth premise is read.
Participants were told that the problems referred to some beads made
by a bead manufacturer. The beads have various colors (for example,
some are red, some are blue, some green), various shapes (for example,
some are round, some are square, some are triangular), various materials
(for example, some are plastic, some are metal, some are wooden), and
various patterns (for example, some are striped, some are plain). Each
problem referred to the beads in a particular bag. Each problem presented
some facts about the beads in that bag, and then asked a question about
what could be figured out from the facts. The facts were presented first,
and then, below a line, the question was presented. Participants were
told to write their answers in the space below the question, which asked
them to write down, in the order they figured things out, everything
that they could figure out about a topic that was presented. The problems,
their topics, and the predicted lines of reasoning are presented in
Table 1. A second set of problems (Problem Set 2) was constructed that
was identical to the set shown in Table 1, except that the premises
were presented in random order, thus requiring participants to search
for the premises that allow application of a schema. Table 1 indicates
within parentheses following each premise the order in which the premises
were presented in Problem Set 2.
The task was administered in small groups (n < 10 per group). Each
participant was presented one set of problems. Order of problems within
each problem set was randomized, with two random orders constructed.
Participants were assigned randomly to problem sets and problem orders.
Results and Discussion
In scoring participants' responses the following guidelines were used.
First, some participants occasionally wrote down premises. Since these
responses did not seem to be prompted by any particular circumstances,
and could not be counted for or against ML theory or other theories,
they were omitted from all tallies. Second, participants infrequently
repeated previously made responses; since these had already been
scored, they were ignored the second time. Third, occasionally a participant
would write down an inference with the form if-then, where the if-clause
was either a premise or a previously written-down inference. In these
cases, the if-clause seemed to be stating a reason for inferring the
then-clause, so only the then-clause was included in the scoring. Fourth,
in a few cases responses deviated from the predicted response only by the
inclusion or omission of and. For example, participants would occasionally
write down predicted inferences conjoined with a premise or the output
of another inference, or, in instances in which the model predicted a
conjunction, participants sometimes wrote down the components of the
conjunction on separate lines. Such responses are accounted for by the optional
one-time use of feeder schemas at the readout stage and were not listed
separately. Finally, some participants tended to write down negative inferences,
e.g., "the large beads are not red," by enumerating the possible
positive complements, e.g., "the large beads are green, or blue,
and so forth." Such responses were scored as negative inferences.
The responses obtained from the participants were compared to the predicted
responses listed in Table 1. For each predicted inference in Table 1,
the proportion of participants who wrote down that inference is indicated
(Problem Set 1 first, followed by Problem Set 2). For the 20 problems,
ML theory predicts that 51 inferences would be written down. (ML theory
predicts that the output of the core schemas applied by the DRR would
be written down; previous investigations have reported that the output
of the feeder schemas is not typically written down, as these inferences
are thought to be paraphrases rather than inferences. For the 20 problems,
the theory predicts application of core schemas 51 times.) For the 21
participants receiving Problem Set 1, this leads to a prediction of 1071
responses (i.e., inferences written down), of which 76% were written
down.
Inspection of Table 1 reveals that the proportions with which predicted
responses were made were not equal across all problems and inferences.
For example, on several problems some of the earlier inferences in the
predicted lines of reasoning tended to be written down less often than
the final inference (e.g., problems 2, 5, 7, 11, 17, 18, and 19). For
the most part the intermediate inferences that were not written down
involved schemas 1, 2, and 3 when they were applied early in a line
of reasoning; inferences made from application of the same schemas as
the last inference in a line of reasoning were almost always written
down. Thus, failure to write down such inferences early in a line of
reasoning does not seem to indicate that the inferences were not made,
but rather to indicate that they seemed less important than the final
output of the reasoning processes. This interpretation is supported
by the fact that over 95% of multiple inferences were written down in
the predicted order, suggesting that participants were constructing
the lines of reasoning that were predicted, but failed to write down
every step in the processes.
For Problem Set 2, with 18 participants, the theory predicts 918 responses,
of which 80% were written down. Inspection of Table 1 reveals that the
data for Problem Set 2 were extremely similar to those for Problem Set
1. Most striking is the fact that the order of inferences written down
was consistent with that predicted by the DRR (94.25% of the time),
even though the order in which the premises were presented would not
produce such output unless the reasoning process were guided by the availability
of the schemas rather than by the order of premise input. It is not
obvious what theoretical account could be provided for this consistency
of output ordering except for the schema-availability account provided
by ML theory.
The only sort of problems on which participants did not conform consistently
with the predicted output of ML theory were those that required application
of Schema 5. Even so, on these problems (problems 3 and 9) a majority
of participants wrote down the lines of reasoning predicted by ML theory,
although a large minority did not. Most of those participants who did
not write down the predicted lines of reasoning on these two problems
instead wrote down lines of reasoning that were consistent with a supposition-of-alternatives
strategy. This strategy constructs two suppositional
lines of reasoning, one under each of the two alternatives of a disjunctive
premise. On problem 3, for example, this sort of line of reasoning results
in the intermediate inference that "the beads are wooden and square
or metal and triangular" rather than "the beads are square or
triangular." For Problem Set 1, such inferences constituted 29%
of the intermediate inferences on problem 3 and 38% on problem 9, and
taken together with the output predicted by the DRR, they made up 91%
and 95% of the responses to problems 3 and 9, respectively. For Problem
Set 2, such inferences constituted 44% of the intermediate inferences
on problem 3 and 22% on problem 9, and taken together with the intermediate
inferences predicted by the DRR, they constituted 66% and 89% of the
intermediate inferences on problems 3 and 9, respectively.
In summary, participants made the vast majority of both the intermediate
and final inferences predicted by ML theory. More importantly, these
inferences were almost always made in the predicted order, even when
the premises were not presented in an order that by itself was conducive
to such output.

Experiment 2
Method
Participants.
Fifty-two undergraduate students who were enrolled in an introductory
psychology course at Baruch College participated to fulfill a course
requirement. Several participants either did not provide responses to
all problems or did not follow instructions, and their data are not
included, leaving a total of 21 participants for Problem Set 1 and 20
participants for Problem Set 2.
Tasks and Procedures.
The problems were identical in logical form to those in Experiment
1, but with different content. Unlike the problems of Experiment 1,
which referred to beads in a bag, the problems in Experiment 2 presented
narrative information about different groups of children in Brazil.
(By placing the children in the stories in an unfamiliar society, participants
should be less likely to import information from long-term memory into
their lines of reasoning.) Participants were told that the children
are in different places, are wearing different sorts of clothes, are
doing different sorts of things, and so forth. Each problem presented
some facts about the particular group of children for that problem.
Each problem presented a topic, and participants were told to write
down everything they could figure out about that topic from the facts
in the order that they figured things out. The premises for the problems
and the predicted inferences are shown in Table 2.
Results and Discussion
The scoring guidelines were the same as those used in Experiment 1. Table
2 shows the proportions with which each of the predicted inferences
for each problem was given, for the problems in both Problem Set 1 and
Problem Set 2. For Problem Set 1 a total of 1071 inferences were predicted
(51 inferences x 21 participants), of which 76% were written down, and
for Problem Set 2 a total of 1020 inferences were predicted (51 inferences
x 20 participants), of which 84% were written down.
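The predicted totals reported for the two experiments follow from multiplying the 51 predicted inferences by the number of participants retained in each problem set. A brief Python check, using only figures stated in the text (the variable names are illustrative):

```python
INFERENCES_PER_PROBLEM_SET = 51  # core-schema applications predicted across the 20 problems

# Participants retained in each condition, as reported in the two Method sections.
participants = {
    ("Experiment 1", "Problem Set 1"): 21,
    ("Experiment 1", "Problem Set 2"): 18,
    ("Experiment 2", "Problem Set 1"): 21,
    ("Experiment 2", "Problem Set 2"): 20,
}

predicted_totals = {
    condition: INFERENCES_PER_PROBLEM_SET * n
    for condition, n in participants.items()
}

for condition, total in predicted_totals.items():
    print(condition, total)
```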
Inspection of Table 2 reveals a pattern of responses that is quite
similar to that of Experiment 1. The predicted inferences were not written
down equally often across problems and inferences: as in Experiment 1, participants
often failed to write down inferences early in lines of reasoning but
almost always included final inferences. Again, the strongest evidence that participants were
making inferences in the order predicted by the DRR was that 97% of
multiple inferences were written down in the order predicted in Problem
Set 1, and 96% of multiple inferences were written down in the predicted
order for the problems in Problem Set 2, where such an order was at
variance with the presented premise order. Thus, for the problems in
Experiment 2, as well as for their formal parallels in Experiment 1,
the order in which inferences were written down was predicted successfully
by the availability of the schemas rather than by the order in which
information was presented.
As in Experiment 1, on those problems requiring application of schema
5, e.g., problems 3 and 9, a large number of participants revealed lines
of reasoning that went beyond what is available on the DRR, instead
writing down inferences that are consistent with a supposition-of-alternatives
strategy. For example, problem 3 led to "the older children are wearing
red shirts and selling Jornal do Brasil or they are wearing blue shirts
and selling O Globo." Given that Braine et al. (1995) did not report
the use of such a strategy on problems requiring schema 5 when the problems
were presented at the propositional rather than the predicate level, the
question arises whether the greater complexity of the information
at the predicate-logic level encourages reasoners to keep track of the
information more carefully, which the supposition-of-alternatives
strategy allows them to do.
Contents of the "Other" Responses in Experiments 1 and 2

Of course, not everything written down was an inference predicted by
ML theory. Knowing what metric to use to assess how many nonpredicted
inferences were written down is problematic, for the possibilities concerning
what could be written down, and how things could be written down, were
undefined. Some participants went beyond writing down inferences that
depend on the logic particles and quantifiers. For example, one participant
responded to problem 3 of Problem Set 2 by developing a narrative in
which the red and blue shirts worn by children selling the two sorts
of newspapers were colors signifying two drug gangs, "like the
Bloods and the Crips," and the two newspapers were a code for the different
drugs they were dealing. Such extralogical inferences were
not included in the tabulations presented in Tables 1 and 2, and they
are not germane to the question of whether the predicted
inferences are made. ML theory proposes that the inferences made from
application of the schemas can cohabit in the same lines of reasoning
with inferences from a variety of other sources, such as those following
from scripts, story grammars, and so forth, and there is nothing in
making such extralogical inferences that bears on whether the logic
inferences are being made. (Indeed, the participant who wrote down that
the colors signified gang affiliations also made the inferences predicted
by ML theory.) How often such inferences were made is difficult to quantify,
because there is no theory about them. How many inferences, for example,
should be counted when someone writes down that the shirts designate
different gangs selling different drugs? Such inferences, however, clearly
were made much less often than the predicted inferences
counted in Tables 1 and 2.
One possible source of nonpredicted inferences that were made concerns
invited inferences and conversational implicatures, e.g., interpreting
disjunction as exclusive rather than inclusive, or converting propositions
of the form All P are Q to All Q are P. Although such inferences
are the focus of much attention in the reasoning literature (see discussions
in Braine & O'Brien, 1998), they were relatively rare in the protocols
here. Another possible source of nonpredicted responses would be standard
logic, which would allow for many logical inferences that would not
be made by the schemas of ML theory. No such inferences were written
down by any participant. A final possible source of nonpredicted inferences
would be the use of the feeder schemas, e.g., schemas 6 and 7. Such
inferences were made, but they did not occur very often. In brief, the
only inferences that were made frequently were those reported
in the results sections for Experiments 1 and 2.
General Discussion

The experiments reported here provide additional evidence for a mental
predicate logic. Unlike the investigation of Yang et al. (1998), which
provided only indirect evidence, the present studies provide direct
evidence that participants applied the proposed inference schemas. The
most persuasive evidence comes from Problem Sets 2 in both Experiments
1 and 2, in which participants wrote down inferences in the order predicted
by the availability of the schemas, even though the premise information
was not presented in a way that would lead to such lines of reasoning
otherwise. In comparison to the inferences predicted by ML theory, relatively
few inferences of any other sort were made in any systematic fashion.
The best explanation for the data reported in the present study, therefore,
seems to be the ML theory for reasoning with predicate-logic premises.
It is a fair question, of course, whether any other psychological
theory of reasoning could provide an equally good account of these
data. Only two other theories are available that would claim to explain
such reasoning: the mental-logic theory of Rips (e.g., Rips, 1994) and
the mental-models theory of Johnson-Laird and his associates (e.g.,
Johnson-Laird & Byrne, 1991). Neither theory, however, seems capable
of providing a clear account of how the problems reported here would
be solved. Let us consider first Rips's theory. First, Rips's theory
allows for few inferences to be made without specific conclusions to
be tested. How the theory would make inferences when only a topic set
is provided is yet to be specified. Second, Rips's theory apparently
would lead to the prediction that many of the problems presented here
would be quite difficult, when, in fact, participants had little difficulty
in arriving at a final inference (in many cases without disagreement
among participants). Finally, Rips's theory uses a quantifier-free representational
system. In standard logic, a universally quantified sentence can be
represented with a universal quantifier followed by a conditional sentence,
e.g., "All the red beads are plastic" can be represented as "For
every bead, if it is red then it is plastic." In Rips's system,
it becomes "If Red (x), Then Plastic (x)," where x is the
individual variable BEADS and the universal quantifier is eliminated.
An existentially quantified sentence can be represented with an existential
quantifier followed by a conjunction, e.g., "There are some red plastic
beads" can be represented as "There exist some beads that are
red and plastic." In Rips's system, it becomes "Red (a)
and Plastic (a)," where a is a temporary name or a
constant that has not occurred in the preceding undischarged steps.
By using this quantifier-free representation, the inference rules defined
for a propositional-level logic may sometimes be used in quantified
predicate reasoning. The data reported here reveal no tendencies to
treat universal propositions as conditionals, nor existential propositions
as conjunctions, as they should according to Rips's theory.
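The contrast between the two notations can be written out symbolically. The following is only a sketch: it assumes the predicate names Red and Plastic from the examples above and a domain restricted to beads, and the layout is illustrative and is not taken from Rips's own presentation.

```latex
% Left: standard predicate logic. Right: the quantifier-free form.
% x is a variable ranging over beads; a is a temporary name (constant).
\begin{align*}
\forall x\,\bigl(\mathrm{Red}(x) \rightarrow \mathrm{Plastic}(x)\bigr)
  &\quad\Longrightarrow\quad
  \text{IF } \mathrm{Red}(x) \text{ THEN } \mathrm{Plastic}(x) \\
\exists x\,\bigl(\mathrm{Red}(x) \land \mathrm{Plastic}(x)\bigr)
  &\quad\Longrightarrow\quad
  \mathrm{Red}(a) \text{ AND } \mathrm{Plastic}(a)
\end{align*}
```

On this rendering, the universal sentence reduces to a conditional and the existential sentence to a conjunction, which is why propositional-level rules can sometimes apply to them directly.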
The mental models approach of Johnson-Laird and his colleagues has
addressed reasoning with predicate-logic premises in two sets of work,
one concerning Aristotelian syllogisms and the other concerning what
they refer to as "multiple quantification." The two sets provide
quite different sorts of models, and of the two, the more pertinent
is the work on syllogisms. (The work on multiply quantified propositions
has been limited to whether various objects are, or are not, in the
same location; it has represented the quantifiers quite differently
than has the work about predicate syllogisms.) As an example of their
approach, consider the two premises, All beads are green and
All green things are round, which lead to the following models
(Johnson-Laird & Byrne, 1991, p. 121):
[b] g; [g] r; [[b] g] r
[b] g; [g] r; [[b] g] r
...
The first two columns represent the first premise, with the first two
rows containing tokens for green beads and the third row (the ellipsis)
indicating the possibility of other objects. Columns three and four
represent the second premise, with the first two rows containing tokens
for round green things and the third row again indicating the possibility
of other objects. Finally, columns 5 - 7 represent the combination of
information from the models for premises one and two, with the first
two rows containing tokens for green round beads and the third row again
indicating the possibility of some other things. The square brackets
indicate exhaustivity; for example, in the models in columns one and
two, the brackets indicate that no further models can be included that
have a token for bead without having a token for green. The nested bracketing
in the models in columns 5 - 7 indicates that beads are exhausted in relation
to green, and green is exhausted in relation to round. The modelers
state that the final model supports the conclusion that All beads
are round. This way of representing quantified propositions can
be applied to premises of the sort presented in the problems reported
here, although not without encountering some difficulties. Consider
the premises all the beads are red and all the beads are
metal, which could lead to the following set of models (omitting
the redundant models, as will be done henceforth):
[b] r; [b] m; [b] r m
Note that the structure of this model differs somewhat from what Johnson-Laird
and his associates described above, in that the square brackets cannot
be nested because, unlike the Aristotelian syllogisms, these premises
contain no middle term. The final model, however, does appear to support
the conclusion that all beads are red metal beads, which was
the conclusion to be evaluated by subjects. Representation by models
of other problems is often less obvious. Consider the premise that there
are no square wooden beads. Johnson-Laird and Byrne (1991, p. 120)
stated that a universal negative proposition, e.g., None of the
athletes is a baker, will be represented as:
[a]
[b]
...
Application of this structure to there are no square wooden beads
is problematic. Note that one cannot simply add one line to the model,
as such:
[b]
[w]
[s]
...
because to do so would preclude the possibility of there being a wooden
bead, or a square bead, or a square wooden thing that is not a bead,
and these possibilities clearly should be allowed. Indeed, the appropriate
model seemingly would include six explicit representations:
[b]
[w]
[s]
[b w]
[b s]
[w s]
Note that adding one term to the proposition would expand the required
models, e.g., there are no large square wooden beads would require
12 explicit models. Given that models theory claims that the principal
source of difficulty in reasoning stems from limitations in working
memory, which cannot hold complex or lengthy models, such premises
would quickly make problems intractable. Johnson-Laird, Byrne,
and Schaeken (1994) suggested that in such situations a reasoner would
seek simpler ways to model the information. What such a simpler way
would be for this sort of premise, however, is unclear. For example,
one might propose that there are no square wooden beads would be taken
to mean that there are no beads that are both wooden and square, leading
possibly to:
[b]
[s w]
or one might take the proposition to mean that wooden beads are not
square beads, and vice versa, leading possibly to:
[b s]
[b w]
The choice is not trivial; choosing one way over another to represent
the proposition leads to quite different final models, and thus quite
different conclusions would be drawn. Among the final model sets that
might be drawn from the premises of Problem 36, depending on how one
decides to represent the premises and to treat combinations and their exhaustivity,
are the following:
[b] t
[b g] s
...
or:
[b t w]
[b t]
[b g s]
[b s]
or:
[[b] w] t
[b g] s
Note that not all of these models would lead in any obvious way to
the conclusion that would be drawn from application of the mental-logic
schemas, i.e., that it is the green beads that are not wooden. Until
the models theorists provide greater specificity to the way quantified
propositions are represented and combined, it remains problematic as
to how one should compare the models treatment of the problems presented
in the present work to the mental-logic treatment: The models theory
provides no clear guidance about how people will reason with these problems.
The conclusion is inescapable: To date ML theory provides the only
plausible account of reasoning on problems of the sort reported here.
ML theory predicted successfully which inferences would be drawn and
the order of the intermediate inferences drawn on the way to making
a final inference. Most significantly, ML theory successfully predicted
the order in which inferences would be written down, even when the premises
were presented in orders that did not correspond to the output of the
reasoning processes. Clearly, the present investigation only fills in
one part of a larger investigation into reasoning of a predicate-logic
sort. It does, however, provide some direct evidence that participants
were solving the problems in the way described by ML theory.
References

Braine, M.D.S., and O'Brien, D.P. (Eds.) (in press). Mental
logic. Mahwah, NJ: Lawrence Erlbaum Associates.
Braine, M.D.S., O'Brien, D.P., Noveck, I.A., Samuels, M., Fisch, S.M.,
Lea, R.B., and Yang, Y. (1995). Predicting intermediate and multiple
conclusions in propositional logic inference problems: Further evidence
for a mental logic. Journal of Experimental Psychology:
General, 124, 263-292.
Braine, M.D.S., Reiser, B.J., and Rumain, B. (1984). Some empirical
justification for a theory of natural propositional logic. In G. Bower
(Ed.), The psychology of learning and motivation:
Advances in research and theory (Vol. 18). New York: Academic
Press.
Johnson-Laird, P.N., Byrne, R.M.J., and Schaeken, W. (1994). Propositional
reasoning by models. Psychological Review,
96, 658-673.
Johnson-Laird, P. N., and Byrne, R.M.J. (1991). Deduction.
Hillsdale, NJ: Lawrence Erlbaum Associates
Lea, R.B. (1995). On-line evidence for elaborative logical inference
in text. Journal of Experimental Psychology: Learning,
Memory, and Cognition, 21, 1469-1482.
O'Brien, D.P., Braine, M.D.S., and Yang, Y. (1994). Propositional reasoning
by mental models? Simple to refute in principle and in practice. Psychological
Review, 101, 711-724.
Rips, L.J. (1990). Reasoning. Annual Review of
Psychology, 41, 321-353.
Yang, Y., Braine, M.D.S., and O'Brien,
D.P. (in press)

Table 1
Premises, Topics, and Responses Predicted by the Direct-Reasoning Routine
for Problem Set 1 in Experiment 1, Together With the Proportions With
Which Those Responses Were Given to Problem Set 1 and Problem Set 2
of Experiment 1