# Find duplicate sets of child rows

I have parent and child tables. The parent has seqno as the primary key and information as to who made a call and where and which form they used, etc. The child table has seqno from the parent, the question numbers on the form, and finally the answer to the question. A call is a visit to a store using a specific form which has a variable number of questions depending on the form number. I am looking for a way to find duplicate calls, done by the same person using the same form and with the same answers. If a single answer is different, the call does not qualify as a duplicate. Is there a set-based solution for this?

From your description, I shall assume the tables look like this:

```
Calls
seqno personid formid
301    12      56
302    16      75
303    12      56

Answers
cseqno question answer
301     1       'y'
301     2       'n'
301     3       'y'
302     1       'n'
302     2       'n'
302     3       'n'
303     1       'y'
303     2       'n'
303     3       'n'
```
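As a minimal sketch, the assumed tables can be loaded into a scratch database to experiment with. The example below uses Python's built-in `sqlite3` module with an in-memory SQLite database, which is purely a choice for illustration; the column types and keys are guesses, and only the column names come from the queries in this answer.

```python
import sqlite3

# Assumption: column names come from the queries in this answer; the column
# types and keys are guesses for illustration, not the asker's real schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table Calls
( seqno    integer primary key
, personid integer not null
, formid   integer not null
);
create table Answers
( cseqno   integer not null references Calls (seqno)
, question integer not null
, answer   char(1) not null
, primary key (cseqno, question)
);
insert into Calls values (301, 12, 56), (302, 16, 75), (303, 12, 56);
insert into Answers values
  (301, 1, 'y'), (301, 2, 'n'), (301, 3, 'y'),
  (302, 1, 'n'), (302, 2, 'n'), (302, 3, 'n'),
  (303, 1, 'y'), (303, 2, 'n'), (303, 3, 'n');
""")

# Each answer row combined with the personid and formid of its parent call.
combined = conn.execute("""
select seqno, personid, formid, question, answer
  from Answers
inner
  join Calls
    on cseqno = seqno
 order
    by seqno, question
""").fetchall()

for row in combined:
    print(row)
```

Each printed row carries the parent's personid and formid alongside one answer, which is the combined data the rest of the answer compares.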

Person 12 filled in form 56 twice, but the answers were different the second time, so you don't want to qualify call 303 as a duplicate of 301.

I would start by joining the combined data from the Answers and Calls tables to itself, matching on the question number and on the parent personid and formid, with the added condition that the parent seqno values be different, because you don't want to compare a call to itself:

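```
select something
  from ( select seqno, personid, formid, question, answer
           from Answers
         inner
           join Calls
             on cseqno = seqno ) X
inner
  join ( select seqno, personid, formid, question, answer
           from Answers
         inner
           join Calls
             on cseqno = seqno ) Y
    on X.personid = Y.personid
   and X.formid   = Y.formid
   and X.question = Y.question
   and X.seqno    > Y.seqno
```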

Notice that X.seqno is greater than Y.seqno. If we had used "not equals", then since it's a self-join, each pair would come in twice. With X.seqno greater than Y.seqno, each pair of calls appears only once, and the later call is the one considered the duplicate.
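To see the effect of the inequality, here is a small sketch that runs the self-join twice against the sample data, once with `>` and once with `<>`. It again assumes an in-memory SQLite database driven from Python, and the column types in the `create table` statements are illustrative guesses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table Calls   (seqno integer, personid integer, formid integer);
create table Answers (cseqno integer, question integer, answer char(1));
insert into Calls values (301, 12, 56), (302, 16, 75), (303, 12, 56);
insert into Answers values
  (301, 1, 'y'), (301, 2, 'n'), (301, 3, 'y'),
  (302, 1, 'n'), (302, 2, 'n'), (302, 3, 'n'),
  (303, 1, 'y'), (303, 2, 'n'), (303, 3, 'n');
""")

# The self-join, with the comparison operator left as a placeholder.
SELF_JOIN = """
select X.seqno, Y.seqno, X.question, X.answer, Y.answer
  from ( select seqno, personid, formid, question, answer
           from Answers
         inner
           join Calls
             on cseqno = seqno ) X
inner
  join ( select seqno, personid, formid, question, answer
           from Answers
         inner
           join Calls
             on cseqno = seqno ) Y
    on X.personid = Y.personid
   and X.formid   = Y.formid
   and X.question = Y.question
   and X.seqno    {op} Y.seqno
 order
    by X.seqno, Y.seqno, X.question
"""

greater = conn.execute(SELF_JOIN.format(op=">")).fetchall()
not_equal = conn.execute(SELF_JOIN.format(op="<>")).fetchall()

print(len(greater), "rows with >")     # one row per question for pair (303, 301)
print(len(not_equal), "rows with <>")  # the same pair again as (301, 303)
for row in greater:
    print(row)
```

With the sample data, only calls 301 and 303 share a person and form, so `>` yields three rows (one per question) and `<>` yields six.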

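But what do we select? The answer is to count, for each pair of calls, the number of answers that don't match, and to keep only the pairs where that count is zero.

```
select X.seqno, ' is a duplicate of ', Y.seqno
     , sum( case when X.answer = Y.answer
            then 0 else 1
            end ) mismatches
  from ...
group
    by X.seqno, Y.seqno
having sum( case when X.answer = Y.answer
            then 0 else 1
            end )
       = 0
```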
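Putting the pieces together, the following sketch runs a complete version of the query and keeps only the pairs with zero mismatches. Because the sample data contains no true duplicate, it adds a hypothetical call 304 (not in the original data) that repeats the answers of call 301. SQLite via Python is again just an assumption made so the example can be run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table Calls   (seqno integer, personid integer, formid integer);
create table Answers (cseqno integer, question integer, answer char(1));
insert into Calls values (301, 12, 56), (302, 16, 75), (303, 12, 56);
insert into Answers values
  (301, 1, 'y'), (301, 2, 'n'), (301, 3, 'y'),
  (302, 1, 'n'), (302, 2, 'n'), (302, 3, 'n'),
  (303, 1, 'y'), (303, 2, 'n'), (303, 3, 'n');
-- Hypothetical extra call: same person, same form, same answers as call 301.
insert into Calls values (304, 12, 56);
insert into Answers values (304, 1, 'y'), (304, 2, 'n'), (304, 3, 'y');
""")

duplicates = conn.execute("""
select X.seqno as duplicate_call
     , Y.seqno as earlier_call
     , sum( case when X.answer = Y.answer
            then 0 else 1
            end ) as mismatches
  from ( select seqno, personid, formid, question, answer
           from Answers
         inner
           join Calls
             on cseqno = seqno ) X
inner
  join ( select seqno, personid, formid, question, answer
           from Answers
         inner
           join Calls
             on cseqno = seqno ) Y
    on X.personid = Y.personid
   and X.formid   = Y.formid
   and X.question = Y.question
   and X.seqno    > Y.seqno
group
    by X.seqno, Y.seqno
having sum( case when X.answer = Y.answer
            then 0 else 1
            end ) = 0
 order
    by X.seqno, Y.seqno
""").fetchall()

print(duplicates)  # [(304, 301, 0)]
```

Call 303 is paired with 301 and call 304 is paired with 303, but each of those pairs has one mismatched answer, so only the pair (304, 301) survives the HAVING clause.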

If the answers are optional, such that one call might have more answers than another call by the same person for the same form, then you'll have to use an outer join and COALESCE in the CASE, and probably have to run both inequalities, X.seqno greater than Y.seqno and X.seqno less than Y.seqno. I didn't test that scenario.
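One possible reading of that suggestion is sketched below; it is an illustration under stated assumptions rather than a tested production query. It first pairs up calls from the Calls table, then left outer joins the answers of one call to the other and counts mismatches with COALESCE, once for each inequality. A pair counts as a duplicate only if it passes in both directions. Calls 304 and 305 are hypothetical additions, and the empty-string placeholder in COALESCE assumes that no real answer is ever an empty string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table Calls   (seqno integer, personid integer, formid integer);
create table Answers (cseqno integer, question integer, answer char(1));
insert into Calls values (301, 12, 56), (302, 16, 75), (303, 12, 56);
insert into Answers values
  (301, 1, 'y'), (301, 2, 'n'), (301, 3, 'y'),
  (302, 1, 'n'), (302, 2, 'n'), (302, 3, 'n'),
  (303, 1, 'y'), (303, 2, 'n'), (303, 3, 'n');
-- Hypothetical calls: 304 skips question 3, 305 repeats call 301 exactly.
insert into Calls values (304, 12, 56), (305, 12, 56);
insert into Answers values
  (304, 1, 'y'), (304, 2, 'n'),
  (305, 1, 'y'), (305, 2, 'n'), (305, 3, 'y');
""")

# Pairs (this_call, other_call) where every answer of this_call has an equal
# answer in other_call. A missing answer becomes '' and so never matches.
ONE_WAY = """
select P.this_call, P.other_call
  from ( select X.seqno as this_call, Y.seqno as other_call
           from Calls X
         inner
           join Calls Y
             on X.personid = Y.personid
            and X.formid   = Y.formid
            and X.seqno    {op} Y.seqno ) P
inner
  join Answers A
    on A.cseqno = P.this_call
left outer
  join Answers B
    on B.cseqno   = P.other_call
   and B.question = A.question
group
    by P.this_call, P.other_call
having sum( case when coalesce(B.answer, '') = A.answer
            then 0 else 1
            end ) = 0
"""

# Run both inequalities: later call checked against earlier, and the reverse.
later_ok = set(conn.execute(ONE_WAY.format(op=">")).fetchall())
earlier_ok = set(conn.execute(ONE_WAY.format(op="<")).fetchall())

# Keep a pair only if the answers match in both directions.
duplicates = sorted(
    (later, earlier)
    for (later, earlier) in later_ok
    if (earlier, later) in earlier_ok
)
print(duplicates)  # [(305, 301)]
```

Call 304 matches 301 on every question it answered, so it passes the first direction, but 301 has an answer to question 3 that 304 lacks, so the pair fails the reverse check and is not reported.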

This was last published in January 2003
