-->

Oracle USING clause best practice

2020-08-20 04:21发布

问题:

Disclaimer: I'm a developer and not a DBA.

I've been a huge fan of the USING clause in Oracle since I accidentally stumbled upon it and have used it in place of the old-fashioned ON clause to join fact tables with dimension tables ever since. To me, it creates a much more succinct SQL and produces a more concise result set with no unnecessary duplicated columns.

However, I was asked yesterday by a colleague to convert all my USING clauses into ONs. I will check with him and ask him what his reasons are. He works much more closely with the database than I do, so I assume he has some good reasons.

I have not heard back from him (we work in different timezones), but I wonder if there are any guidelines or best practices regarding the use of the "using" clause? I've googled around quite a bit, but have not come across anything definitive. In fact, I've not even even a good debate anywhere.

Can someone shed some light on this? Or provide a link to a good discussion on the topic?

Thank you!

回答1:

You're presumably already aware of the distinction, but from the documentation:

ON condition Use the ON clause to specify a join condition. Doing so lets you specify join conditions separate from any search or filter conditions in the WHERE clause.

USING (column) When you are specifying an equijoin of columns that have the same name in both tables, the USING column clause indicates the columns to be used. You can use this clause only if the join columns in both tables have the same name. Within this clause, do not qualify the column name with a table name or table alias.

So these would be equivalent:

select e.ename, d.dname
from emp e join dept d using (deptno);

select e.ename, d.dname
from emp e join dept d on d.deptno = e.deptno;

To a large extent which you use is a matter of style, but there are (at least) two situations where you can't use using: (a) when the column names are not the same in the two tables, and (b) when you want to use the joining column:

select e.ename, d.dname, d.deptno
from emp e join dept d using(deptno);

select e.ename, d.dname, d.deptno
                         *
ERROR at line 1:
ORA-25154: column part of USING clause cannot have qualifier

You can of course just leave off the qualifier and select ..., deptno, as long as you don't have another table with the same column that isn't joined using it:

select e.ename, d.dname, deptno
from emp e join dept d using (deptno) join mytab m using (empno);

select e.ename, d.dname, deptno
                         *
ERROR at line 1:
ORA-00918: column ambiguously defined

In that case you can only select the qualified m.deptno. (OK, this is rather contrived...).

The main reason I can see for avoiding using is just consistency; since you sometimes can't use it, occasionally switching to on for those situations might be a bit jarring. But again that's more about style than any deep technical reason.

Perhaps your colleague is simply imposing (or suggesting) coding standards, but only they will know that. It also isn't quite clear if you're being asked to change some new code you've written that is going through review, or old code. If it's the latter then regardless of the reasons for them preferring on, I think you'd need to get a separate justification for modifying proven code, as there's a risk of introducing new problems even when the modified code is retested - quite apart from the cost/effort involved in the rework and retesting.

A couple of things strike me about your question though. Firstly you describes the on syntax as 'old-fashioned', but I don't think that's fair - both are valid and current (as of SQL:2011 I think, but citation needed!). And this:

produces a more concise result set with no unnecessary duplicated columns.

... which I think suggests you're using select *, otherwise you would just select one of the values, albeit with a couple of extra characters for the qualifier. Using select * is generally considered bad practice (here for example) for anything other than ad hoc queries and some subqueries.



回答2:

Related question.

It seems the main difference is syntactic: the columns are merged in a USING join.

In all cases this means that you can't access the value of a joined column from a specific table, in effect some SQL will not compile, for example:

SQL> WITH t AS (SELECT 1 a, 2 b, 3 c FROM dual),
  2       v AS (SELECT 1 a, 2 b, 3 c FROM dual)
  3  SELECT t.* FROM t JOIN v USING (a);

SELECT t.* FROM t JOIN v USING (a)
         ^    
ORA-25154: column part of USING clause cannot have qualifier

In an outer join this means you can't access the outer table value:

SQL> WITH t AS (SELECT 1 a, 2 b, 3 c FROM dual),
  2       v AS (SELECT NULL a, 2 b, 3 c FROM dual)
  3  SELECT * FROM t LEFT JOIN v USING (a)
  4   WHERE v.a IS NULL;

 WHERE v.a IS NULL
         ^
ORA-25154: column part of USING clause cannot have qualifier

This means that there is no equivalent for this anti-join syntax with the USING clause:

SQL> WITH t AS (SELECT 1 a, 2 b, 3 c FROM dual),
  2       v AS (SELECT NULL a, 2 b, 3 c FROM dual)
  3  SELECT * FROM t LEFT JOIN v ON v.a = t.a
  4   WHERE v.a IS NULL;

         A          B          C A          B          C
---------- ---------- ---------- - ---------- ----------
         1          2          3  

Apart from this, I'm not aware of any difference once the SQL is valid.

However, since it seems this syntax is less commonly used, I wouldn't be surprised if there were specific bugs that affect only the USING clause, especially in early versions where ANSI SQL was introduced. I haven't found anything on MOS that could confirm this, partly because the USING word is ubiquitous in bug descriptions.

If the reason for not using this feature is because of bugs, it seems to me the burden of the proof lies with your colleague: the bugs must be referenced/documented, so that the ban can eventually be lifted once the bugs are patched (database upgrade...).

If the reason is cosmetic or part of a coding convention, surely it must be documented too.



回答3:

With USING you also cannot do a join like: select a.id,aval,bval,cval from a left join b on a.id = b.id left join c on c.id = b.id;

that is, only give the column from C when it is matched to a row in the B table.