-->

Combining condition from two columns mysql

2019-01-20 01:44发布

问题:

I would like to combine conditions from 2 different columns for my query. This is my original query. You can test it in sqlfiddle.com.

-- creating database first for test data
create table attendance(Id int, DateTime datetime, Door char(20));
INSERT INTO attendance VALUES
(    1,   '2016-01-01 08:00:00',  'In'),
(    2,   '2016-01-01 09:00:00',  'Out'),
(    3,   '2016-01-01 09:15:00',  'In'),
(    4,   '2016-01-01 09:30:00',  'In'),
(    5,   '2016-01-01 10:00:00',  'Out'),
(    6,   '2016-01-01 15:00:00',  'In');

SELECT * FROM attendance;
SELECT 
@id:=@id+1 Id,
MAX(IF(Door = 'In', DateTime, NULL)) `Check In`,
MAX(IF(Door = 'Out', DateTime, NULL)) `Check Out`
FROM
(SELECT 
*, 
CASE 
    WHEN
        (Door != 'Out' AND @last_door = 'Out')
    THEN @group_num:=@group_num+1
    ELSE @group_num END door_group, 
    @last_door:=Door
FROM attendance 
JOIN (SELECT @group_num:=1,@last_door := NULL) a
) t JOIN (SELECT @id:=0) b
GROUP BY t.door_group
HAVING SUM(Door = 'In') > 0 AND SUM(Door = 'Out') > 0;

//output
+------+---------------------+---------------------+
| Id   | Check In            | Check Out           |
+------+---------------------+---------------------+
|    1 | 2016-01-01 08:00:00 | 2016-01-01 09:00:00 |
|    2 | 2016-01-01 09:30:00 | 2016-01-01 10:00:00 |
+------+---------------------+---------------------+

From query above, I would like to add one more column.

-- creating database first for test data
create table attendance(Id int, DateTime datetime, Door char(20), Active_door char(20));
INSERT INTO attendance VALUES
(    1,   '2016-01-01 08:00:00',  'In', ''),
(    2,   '2016-01-01 09:00:00',  'Out', ''),
(    3,   '2016-01-01 09:15:00',  'In', ''),
(    4,   '2016-01-01 09:30:00',  'In', ''),
(    5,   '2016-01-01 09:35:00',  '', 'On'),
(    6,   '2016-01-01 10:00:00',  'Out', ''),
(    7,   '2016-01-01 16:00:00',  '', 'Off');

This is the changes I made to my query but it's not working.

SELECT * FROM attendance;
SELECT 
@id:=@id+1 Id,
MAX(IF(Door = 'In' OR Active_door = "On", DateTime, NULL)) `Check In`,
MAX(IF(Door = 'Out' OR Active_door = "Off", DateTime, NULL)) `Check Out`
FROM
(SELECT 
*, 
CASE 
    WHEN
        ((Door != 'Out' OR Active_door != "Off") AND (@last_door = 'Out' OR  @last_door = 'Off'))
    THEN @group_num:=@group_num+1
    ELSE @group_num END door_group, 
    @last_door:=Door
FROM attendance 
JOIN (SELECT @group_num:=1,@last_door := NULL) a
) t JOIN (SELECT @id:=0) b
GROUP BY t.door_group
HAVING SUM(Door = 'In') > 0 OR SUM(Active_door = 'On') > 0 AND SUM(Door = 'Out') > 0  OR SUM(Active_door = 'Off') > 0;

//output
+------+---------------------+---------------------+
| Id   | Check In            | Check Out           |
+------+---------------------+---------------------+
|    1 | 2016-01-01 08:00:00 | 2016-01-01 09:00:00 |
|    2 | 2016-01-01 09:35:00 | 2016-01-01 10:00:00 |
|    3 | NULL                | 2016-01-01 16:00:00 |
+------+---------------------+---------------------+

//my desire output
+------+---------------------+---------------------+
| Id   | Check In            | Check Out           |
+------+---------------------+---------------------+
|    1 | 2016-01-01 08:00:00 | 2016-01-01 09:00:00 |
|    2 | 2016-01-01 09:35:00 | 2016-01-01 16:00:00 |
+------+---------------------+---------------------+

Please help me guys how can I get the desired output. I would like to get last in and last out either from both columns. Thank you in advance.

回答1:

This strives to keep the solution easily maintainable without finishing the final query all in one shot which would have almost doubled its size (in my mind). This is because the results need to be match and represented on one row with matched In and Out events. So at the end, I use a few worktables. It is implemented in a stored procedure.

The stored procedure uses several variables that are brought in with a cross join. Think of the cross join as just a mechanism to initialize variables. The variables are maintained safely, so I believe, in the spirit of this document often referenced in variable queries. The important parts of the reference are the safe handling of variables on a line forcing their being set prior to other columns using them. This is achieved through the greatest() and least() functions that have higher precedence than variables being set without the use of those functions. Note, too, that coalesce() is often used for the same purpose. If their use seems strange, such as taking the greatest of a number known to be greater than 0, or 0, well that is deliberate. Deliberate in forcing the precedence ordering of variables being set.

The columns in the query named things like dummy2 etc are columns that the output was not used, but they were used to set variables inside of, say, the greatest() or another. This was mentioned above. Output like 7777 was a placeholder in the 3rd slot, as some value was needed for the if() that was used. So ignore all that.

I have included several screen shots of the code as it progressed layer by layer to help you visualize the output. And how these iterations of development are slowly folded into the next phase to expand upon the prior.

I am sure my peers could improve upon this in one query. I could have finished it off that way. But I believe it would have resulted in a confusing mess that would break if touched.

Schema:

create table attendance2(Id int, DateTime datetime, Door char(20), Active_door char(20));
INSERT INTO attendance2 VALUES
(    1,   '2016-01-01 08:00:00',  'In', ''),
(    2,   '2016-01-01 09:00:00',  'Out', ''),
(    3,   '2016-01-01 09:15:00',  'In', ''),
(    4,   '2016-01-01 09:30:00',  'In', ''),
(    5,   '2016-01-01 09:35:00',  '', 'On'),
(    6,   '2016-01-01 10:00:00',  'Out', ''),
(    7,   '2016-01-01 16:00:00',  '', 'Off');

drop table if exists oneLinersDetail;
create table oneLinersDetail
(   -- architect this depending on multi-user concurrency
    id int not null,
    dt datetime not null,
    door int not null,
    grpIn int not null,
    grpInSeq int not null,
    grpOut int not null,
    grpOutSeq int not null
);

drop table if exists oneLinersSummary;
create table oneLinersSummary
(   -- architect this depending on multi-user concurrency
    id int not null,
    grpInSeq int null,
    grpOutSeq int null,
    checkIn datetime null, -- we are hoping in the end it is not null
    checkOut datetime null -- ditto
);

Stored Procedure:

DROP PROCEDURE IF EXISTS fetchOneLiners;
DELIMITER $$
CREATE PROCEDURE fetchOneLiners()
BEGIN
    truncate table oneLinersDetail; -- architect this depending on multi-user concurrency

    insert oneLinersDetail(id,dt,door,grpIn,grpInSeq,grpOut,grpOutSeq)
    select id,dt,door,grpIn,grpInSeq,grpOut,grpOutSeq  
    from 
    (   select id,dt,door, 
        if(@lastEvt!=door and door=1, 
            greatest(@grpIn:=@grpIn+1,0), 
            7777) as dummy2, -- this output column we don't care about (we care about the variable being set)
        if(@lastEvt!=door and door=2, 
            greatest(@grpOut:=@grpOut+1,0), 
            7777) as dummy3, -- this output column we don't care about (we care about the variable being set)
        if (@lastEvt!=door,greatest(@flip:=1,0),least(@flip:=0,1)) as flip, 
        if (door=1 and @flip=1,least(@grpOutSeq:=0,1),7777) as dummy4, 
        if (door=1 and @flip=1,greatest(@grpInSeq:=1,0),7777) as dummy5, 
        if (door=1 and @flip!=1,greatest(@grpInSeq:=@grpInSeq+1,0),7777) as dummy6, 
        if (door=2 and @flip=1,least(@grpInSeq:=0,1),7777) as dummy7, 
        if (door=2 and @flip=1,greatest(@grpOutSeq:=1,0),7777) as dummy8, 
        if (door=2 and @flip!=1,greatest(@grpOutSeq:=@grpOutSeq+1,0),7777) as dummy9, 
        @grpIn as grpIn, 
        @grpInSeq as grpInSeq, 
        @grpOut as grpOut, 
        @grpOutSeq as grpOutSeq, 
        @lastEvt:=door as lastEvt 
        from 
        (   select id,`datetime` as dt, 
            CASE   
                WHEN Door='in' or Active_door='on' THEN 1 
                ELSE 2 
            END as door 
            from attendance2 
            order by id 
        ) xD1 -- derived table #1
        cross join (select @grpIn:=0,@grpInSeq:=0,@grpOut:=0,@grpOutSeq:=0,@lastEvt:=-1,@flip:=0) xParams 
        order by id 
    ) xD2 -- derived table #2
    order by id;
    -- select * from oneLinersDetail;

    truncate table oneLinersSummary;    -- architect this depending on multi-user concurrency

    insert oneLinersSummary (id,grpInSeq,grpOutSeq,checkIn,checkOut)
    select distinct grpIn,null,null,null,null
    from oneLinersDetail
    order by grpIn;

    -- select * from oneLinersSummary;

    update oneLinersSummary ols
    join
    (   select grpIn,max(grpInSeq) m
        from oneLinersDetail
        where door=1
        group by grpIn
    ) d1
    on d1.grpIn=ols.id
    set ols.grpInSeq=d1.m;

    -- select * from oneLinersSummary;

    update oneLinersSummary ols
    join
    (   select grpOut,max(grpOutSeq) m
        from oneLinersDetail
        where door=2
        group by grpOut
    ) d1
    on d1.grpOut=ols.id
    set ols.grpOutSeq=d1.m;

    -- select * from oneLinersSummary;

    update oneLinersSummary ols
    join oneLinersDetail old
    on old.door=1 and old.grpIn=ols.id and old.grpInSeq=ols.grpInSeq
    set ols.checkIn=old.dt;

    -- select * from oneLinersSummary;

    update oneLinersSummary ols
    join oneLinersDetail old
    on old.door=2 and old.grpOut=ols.id and old.grpOutSeq=ols.grpOutSeq
    set ols.checkOut=old.dt;

    -- select * from oneLinersSummary;

    -- dump out the results
    select id,checkIn,checkOut
    from oneLinersSummary
    order by id;
    -- rows are left in those two tables (oneLinersDetail,oneLinersSummary)
END$$
DELIMITER ;

Test:

call fetchOneLiners();
+----+---------------------+---------------------+
| id | checkIn             | checkOut            |
+----+---------------------+---------------------+
|  1 | 2016-01-01 08:00:00 | 2016-01-01 09:00:00 |
|  2 | 2016-01-01 09:35:00 | 2016-01-01 16:00:00 |
+----+---------------------+---------------------+

This is the end of the Answer. The below is for a developer's visualization of the steps that led up to finishing the stored procedure.

Versions of development that led up until the end. Hopefully this assists in the visualization as opposed to merely dropping a medium size confusing chunk of code.

Step A

Step B

Step B output

Step C

Step C output