[ JavaScript ] fromCodePoint & codePointAt

Polyfill

 
<script type="text/javascript">
// source: https://tonks.tistory.com/189#_javascript_fromCodePoint 
( function () { 
    if ( ! String.fromCodePoint ) { 

        var fromCharCode = String.fromCharCode; 
        var floor = Math.floor; 
        var maximum = 0x4000; 

        String.fromCodePoint = function fromCodePoint () { 

                var len = arguments.length; 
                if ( ! len ) {  return "";  } 

                var codeUnits = [ ]; 
                var index = 0; 
                var string = ""; 

                var highSurrogate, lowSurrogate; 

                for ( index ; index < len; index++ ) { 

                        var codePoint = Number( arguments[index] ); 

                        if ( ! isFinite(codePoint) || floor(codePoint) != codePoint || codePoint < 0 || codePoint > 0x10FFFF ) { 
                                var error = new RangeError( "Invalid code point: " + codePoint ); 
                                     error.description = error.name + ": " + error.message; 
                                throw error; 
                        } 

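                        // Code points up to 0xFFFF fit in a single UTF-16 code unit.
                        if ( codePoint <= 0xFFFF ) codeUnits.push( codePoint ); 
                        else { 
                                // Larger code points are split into a surrogate pair.
                                codePoint -= 0x10000; 

                                highSurrogate = ( codePoint >> 10 ) + 0xD800; 
                                lowSurrogate = ( codePoint % 0x400 ) + 0xDC00; 

                                codeUnits.push( highSurrogate, lowSurrogate ); 
                        } 

                        // Flush in batches so that fromCharCode.apply never receives too many arguments.
                        if ( index + 1 == len || codeUnits.length > maximum ) { 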
                                string += fromCharCode.apply( null, codeUnits ); 
                                codeUnits.length = 0; 
                        } 
                } 

                return string; 
        }; 

        // String.fromCodePoint.toString = function toString () { return "function fromCodePoint () {\n    [native code]\n}" }; 
    } 
}()); 
</script>
 

Alternatively, see this other source:
https://github.com/mathiasbynens/String.fromCodePoint/blob/master/fromcodepoint.js


A : Latin Capital Letter A ( U+0041 )

The name sounds grand, but the character above is simply the letter A.
It corresponds to Unicode U+0041.

A  A 

<span> &#x41; </span> <span> &#65; </span>
Running the example below produces the same character, A, in every case.


 
<button onclick="testing()"> Click me </button>

<p id="demo1"> </p>
<p id="demo2"> </p>

<script type="text/javascript">
function testing () { 

    demo1.innerHTML = String.fromCodePoint( 0x41 ); 
    demo2.innerHTML = String.fromCodePoint( 65 ); 
} 
</script>
 
The argument passed to fromCodePoint() is the Unicode code point of the character you want.
You can write it either as a hexadecimal literal or as a decimal integer.

The decimal number 65 becomes 0x41 in hexadecimal. ( 0x + hex digits )
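
The conversion between the two notations can be checked directly in JavaScript. This is only a minimal sketch using the built-in toString and parseInt; the variable names are illustrative and do not come from the article.

```javascript
// 65 written in decimal and 0x41 written in hexadecimal are the same number.
const decimal = 65;
const hexDigits = decimal.toString(16);         // hex digits without the 0x prefix
const backToDecimal = parseInt(hexDigits, 16);  // parse the hex digits again

console.log(hexDigits);       // "41"
console.log(backToDecimal);   // 65
console.log(0x41 === 65);     // true
console.log(String.fromCodePoint(0x41) === String.fromCodePoint(65)); // true
```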


For reference, the highest Unicode code point is 1,114,111 ( 0x10FFFF ), so valid code points range from 0 to 0x10FFFF.

Passing a negative number, a number with a fractional part, or a number greater than 0x10FFFF throws a RangeError.
The same error occurs when the argument cannot be converted to a number.
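
The error cases described above can be sketched as follows. The helper tryFromCodePoint is hypothetical (it is not part of the article or of JavaScript itself) and exists only to capture the error name instead of letting the exception stop the script.

```javascript
// Hypothetical helper: returns the resulting string,
// or the name of the error if the call throws.
function tryFromCodePoint(value) {
    try {
        return String.fromCodePoint(value);
    } catch (error) {
        return error.name;
    }
}

// Negative, fractional, out of range, and non-numeric inputs.
const invalidInputs = [-1, 1.5, 0x110000, "abc"];
const outcomes = invalidInputs.map(function (value) {
    return tryFromCodePoint(value);
});

console.log(outcomes); // every entry is "RangeError"

// The highest code point itself is still valid.
const highest = tryFromCodePoint(0x10FFFF);
console.log(highest.length); // 2
```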

A string that can be converted to an integer, as below, is fine.

function testing () { 

    demo1.innerHTML = String.fromCodePoint(  " 0x41 "  ); 
    demo2.innerHTML = String.fromCodePoint(  " 65 "  ); 
} 
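
The strings are accepted because the argument is converted to a number first (the polyfill at the top does this explicitly with Number()). A short sketch of that conversion, with illustrative variable names:

```javascript
// Number() ignores surrounding whitespace and understands the 0x prefix.
const fromHexString = Number(" 0x41 ");
const fromDecimalString = Number(" 65 ");
const fromText = Number("abc");

console.log(fromHexString);      // 65
console.log(fromDecimalString);  // 65
console.log(fromText);           // NaN, which fromCodePoint rejects

const letterFromHex = String.fromCodePoint(" 0x41 ");
const letterFromDecimal = String.fromCodePoint(" 65 ");
console.log(letterFromHex, letterFromDecimal); // A A
```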


 
<button onclick="testing()"> Click me </button>

<p id="charLength"> </p>
<p id="codePoint"> </p>
<p id="charCode"> </p>

<script type="text/javascript">
function testing () { 

    var txt = String.fromCodePoint( 65 ); 

    charLength.innerHTML = txt.length; 

    codePoint.innerHTML = txt.codePointAt( 0 ); 

    charCode.innerHTML = txt.charCodeAt( 0 );
} 
</script>
 


𝔸 : Mathematical Double-struck Capital A ( U+1D538 )

This character corresponds to Unicode U+1D538.
It can be written in an HTML document as follows.

𝔸  𝔸 

<span> &#x1d538; </span> <span> &#120120; </span>

 
<button onclick="testing()"> Click me </button>

<p id="demo1"> </p>
<p id="demo2"> </p>
<p id="demo3"> </p>

<script type="text/javascript">
function testing () { 

    demo1.innerHTML = String.fromCodePoint( 120120 ); 
    demo2.innerHTML = String.fromCodePoint( 0x1d538 ); 
    demo3.innerHTML = String.fromCodePoint( 0xd835, 0xdd38 ); 
} 
</script>
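
The third call works because 0xd835 and 0xdd38 are the UTF-16 surrogate pair for U+1D538. The sketch below repeats the arithmetic that the polyfill at the top of this post uses; toSurrogatePair is a hypothetical helper name introduced only for this illustration.

```javascript
// Hypothetical helper: splits a code point above 0xFFFF
// into its UTF-16 high and low surrogates.
function toSurrogatePair(codePoint) {
    const offset = codePoint - 0x10000;
    const high = (offset >> 10) + 0xD800;   // upper 10 bits
    const low = (offset % 0x400) + 0xDC00;  // lower 10 bits
    return [high, low];
}

const pair = toSurrogatePair(0x1D538);
const pairAsHex = pair.map(function (unit) {
    return unit.toString(16);
});
console.log(pairAsHex); // ["d835", "dd38"]

const fromPair = String.fromCodePoint(pair[0], pair[1]);
const fromSingle = String.fromCodePoint(0x1D538);
console.log(fromPair === fromSingle); // true
```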
 

It is clearly a single character, yet its length is reported as 2,
because JavaScript strings count UTF-16 code units and this character is stored as a surrogate pair.

 
<button onclick="testing()"> Click me </button>

<p id="charLength"> </p>
<p id="codePoint"> </p>
<p id="charCode"> </p>

<script type="text/javascript">
function testing () { 

    var txt = String.fromCodePoint( 120120 ); 

    charLength.innerHTML = txt.length; 

    codePoint.innerHTML = txt.codePointAt( 0 ); 

    charCode.innerHTML = txt.charCodeAt( 0 );
} 
</script>
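
The same values can be inspected without the HTML page, for example in a console. This is a sketch only; the variable name is illustrative.

```javascript
const doubleStruckA = String.fromCodePoint(120120);

console.log(doubleStruckA.length);          // 2 (UTF-16 code units)
console.log(doubleStruckA.codePointAt(0));  // 120120 (the full code point)
console.log(doubleStruckA.charCodeAt(0));   // 55349 (0xD835, the high surrogate only)
console.log(doubleStruckA.charCodeAt(1));   // 56632 (0xDD38, the low surrogate)

// Index 1 points into the middle of the pair, so codePointAt
// returns just the low surrogate there.
console.log(doubleStruckA.codePointAt(1));  // 56632
```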
 


q̣̇ : q (U+0071) +   ̇ (U+0307) +   ̣ (U+0323)

This is a lowercase q with two diacritical marks, one above and one below.
Its length is reported as 3.

<span>q&#x307;&#x323;</span> <span>q&#775;&#803;</span> 

 
<button onclick="testing()"> Click me </button>

<p id="demo1"> </p>
<p id="demo2"> </p>

<script type="text/javascript">
function testing () { 

    demo1.innerHTML = String.fromCodePoint( 0x71, 0x307, 0x323 ); 
    demo2.innerHTML = String.fromCodePoint( 113, 775, 803 ); 
} 
</script>
 

 
<button onclick="testing()"> Click me </button>

<p id="charLength"> </p>
<p id="codePoint"> </p>
<p id="charCode"> </p>

<script type="text/javascript">
function testing () { 

    var txt = String.fromCodePoint( 113, 775, 803 ); 

    charLength.innerHTML = txt.length; 

    codePoint.innerHTML = txt.codePointAt( 0 ); 

    charCode.innerHTML = txt.charCodeAt( 0 );
} 
</script>
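
Here every code point fits in one code unit, so the length of 3 comes from the three separate code points rather than from surrogate pairs. A console sketch with an illustrative variable name:

```javascript
const dottedQ = String.fromCodePoint(113, 775, 803);

console.log(dottedQ.length);          // 3
console.log(dottedQ.codePointAt(0));  // 113 (q, the base letter only)
console.log(dottedQ.codePointAt(1));  // 775 (U+0307, the mark above)
console.log(dottedQ.codePointAt(2));  // 803 (U+0323, the mark below)
console.log(dottedQ.charCodeAt(0));   // 113 (same as codePointAt here)
```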
 


To conclude,
the existing functions fromCharCode and charCodeAt cannot return the correct Unicode value for every character,
which is presumably why fromCodePoint and codePointAt were added.
But... even these two functions cannot give the exact length of a string.
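
For comparison, here is a rough sketch of three ways to count the three example characters from this post. The helper names are hypothetical, and the last one assumes a runtime that supports Intl.Segmenter, which this article does not cover.

```javascript
// Hypothetical helpers for three different notions of "length".
function countCodeUnits(text) {
    return text.length;                 // UTF-16 code units
}
function countCodePoints(text) {
    return Array.from(text).length;     // string iteration goes code point by code point
}
function countGraphemes(text) {
    // Assumption: Intl.Segmenter is available in this runtime.
    const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
    return Array.from(segmenter.segment(text)).length;
}

const samples = [
    String.fromCodePoint(65),             // A
    String.fromCodePoint(120120),         // double-struck A
    String.fromCodePoint(113, 775, 803)   // q with two combining marks
];

const counts = samples.map(function (text) {
    return [countCodeUnits(text), countCodePoints(text), countGraphemes(text)];
});
console.log(counts); // [[1, 1, 1], [2, 1, 1], [3, 3, 1]]
```

Only the grapheme count matches what a reader sees as one character in all three cases.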


