x86 - How do I enable SSE for my freestanding bootable code?


(This question was originally about the cvtsi2sd instruction and the fact that I thought it didn't work on the Pentium M CPU, but in fact it's because I'm using a custom OS and need to manually enable SSE.)

I have a Pentium M CPU and a custom OS which so far used no SSE instructions, but I now need to use them.

Trying to execute any SSE instruction results in interrupt 6, illegal opcode (which in Linux would cause SIGILL, but this isn't Linux), referred to in the Intel Architectures Software Developer's Manual (which I refer to from now on as IASDM) as #UD - Invalid Opcode (Undefined Opcode).

Edit: Peter Cordes identified the right cause and pointed me to the solution, which I summarize below:

If you're running an ancient OS that doesn't support saving XMM regs on context switches, the SSE-enabling bit in one of the machine control registers won't be set.

Indeed, the IASDM mentions this:

If an operating system did not provide adequate system-level support for SSE, executing SSE or SSE2 instructions can generate #UD.

Peter Cordes pointed me to the SSE page on the OSDev wiki, which describes how to enable SSE by writing to both the CR0 and CR4 control registers:

clear the CR0.EM bit (bit 2) [ CR0 &= ~(1 << 2) ]
set the CR0.MP bit (bit 1) [ CR0 |= (1 << 1) ]
set the CR4.OSFXSR bit (bit 9) [ CR4 |= (1 << 9) ]
set the CR4.OSXMMEXCPT bit (bit 10) [ CR4 |= (1 << 10) ]
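The four steps above can be sketched in C roughly as follows. This is only a minimal sketch, not the OSDev wiki's code: the names cr0_with_sse, cr4_with_sse and enable_sse are made up for illustration, and the inline assembly assumes a GCC-compatible compiler targeting 32-bit x86 with the code already running in ring 0.

```c
#include <stdint.h>

/* Bit positions taken from the list above. */
#define CR0_MP          (1u << 1)   /* monitor coprocessor */
#define CR0_EM          (1u << 2)   /* x87 emulation; must be clear for SSE */
#define CR4_OSFXSR      (1u << 9)   /* OS supports FXSAVE/FXRSTOR */
#define CR4_OSXMMEXCPT  (1u << 10)  /* OS handles unmasked SIMD FP exceptions */

/* Hypothetical helper: the CR0 value to write back, given the current one. */
static inline uint32_t cr0_with_sse(uint32_t cr0)
{
    cr0 &= ~CR0_EM;   /* clear CR0.EM */
    cr0 |= CR0_MP;    /* set CR0.MP */
    return cr0;
}

/* Hypothetical helper: the CR4 value to write back, given the current one. */
static inline uint32_t cr4_with_sse(uint32_t cr4)
{
    return cr4 | CR4_OSFXSR | CR4_OSXMMEXCPT;
}

#if defined(__GNUC__) && defined(__i386__)
/* Assumption: called from ring 0 on a 32-bit x86 target.
 * Executing this at any other privilege level faults. */
static inline void enable_sse(void)
{
    uint32_t cr0, cr4;

    __asm__ volatile ("mov %%cr0, %0" : "=r"(cr0));
    __asm__ volatile ("mov %0, %%cr0" : : "r"(cr0_with_sse(cr0)) : "memory");

    __asm__ volatile ("mov %%cr4, %0" : "=r"(cr4));
    __asm__ volatile ("mov %0, %%cr4" : : "r"(cr4_with_sse(cr4)) : "memory");
}
#endif
```

Keeping the bit arithmetic in plain functions, separate from the privileged register accesses, makes it possible to check the masks on a development machine before booting the kernel.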

Note that, in order to be able to write to these registers, if you are in protected mode, you need to be in privilege level 0. The answer to another question explains how to test it: if in protected mode, that is, when bit 0 (PE) in CR0 is set to 1, then you can test bits 0 and 1 of the CS selector, which should both be 0.
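The test described above amounts to two bit masks. A minimal sketch, assuming the kernel has already read CR0 and CS into ordinary integers (the function names are hypothetical, and virtual-8086 mode is ignored):

```c
#include <stdbool.h>
#include <stdint.h>

/* CR0.PE (bit 0) is 1 when the CPU is in protected mode. */
static inline bool in_protected_mode(uint32_t cr0)
{
    return (cr0 & 1u) != 0;
}

/* Bits 0 and 1 of the CS selector give the current privilege level. */
static inline unsigned cpl_from_cs(uint16_t cs)
{
    return cs & 3u;
}

/* Hypothetical helper: may this code write CR0/CR4?
 * Outside protected mode there is no privilege level to check. */
static inline bool can_write_control_regs(uint32_t cr0, uint16_t cs)
{
    return !in_protected_mode(cr0) || cpl_from_cs(cs) == 0;
}
```

For example, a CS value of 0x08 gives privilege level 0, while 0x1B gives privilege level 3.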

Finally, the custom OS must properly handle XMM registers during context switches, by saving and restoring them when necessary.
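One way to do that saving and restoring is with the FXSAVE and FXRSTOR instructions, which are what the CR4.OSFXSR bit advertises support for. They operate on a 512-byte memory area that must be 16-byte aligned. The sketch below is only an illustration under those assumptions: struct fx_state, fx_save and fx_restore are hypothetical names, and the inline assembly assumes a GCC-compatible compiler on x86.

```c
#include <stdalign.h>
#include <stdint.h>

/* Save area for x87/MMX/SSE state: 512 bytes, 16-byte aligned,
 * as FXSAVE and FXRSTOR require. One per task (hypothetical layout). */
struct fx_state {
    alignas(16) uint8_t bytes[512];
};

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
/* Store the outgoing task's FPU and XMM registers into its save area. */
static inline void fx_save(struct fx_state *s)
{
    __asm__ volatile ("fxsave %0" : "=m"(*s));
}

/* Reload the incoming task's FPU and XMM registers from its save area. */
static inline void fx_restore(const struct fx_state *s)
{
    __asm__ volatile ("fxrstor %0" : : "m"(*s));
}
#endif
```

A context switch would then call fx_save on the outgoing task's area and fx_restore on the incoming task's area. A save area must have been filled by a previous fx_save before it is first restored.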

If you're running an ancient or custom OS that doesn't support saving XMM regs on context switches, it won't have set the SSE-enabling bits in the machine control registers. In that case all instructions that touch XMM regs will fault.

It took me a sec to find, but http://wiki.osdev.org/sse explains how to alter CR0 and CR4 to allow SSE instructions to run without #UD.

My first thought on the old version of the question was that you might have compiled your program with -mavx, -march=sandybridge or equivalent, causing the compiler to emit the VEX-encoded version of everything.

cvtsi2sd   xmm1, r/m32          ; SSE2
vcvtsi2sd  xmm1, xmm2, r/m32    ; AVX
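To rule this out from inside the program, you can look at the macros the compiler predefines. GCC and Clang define __AVX__ when AVX code generation is enabled (for example by -mavx or -march=sandybridge), and in that case they emit the VEX-encoded forms. A minimal sketch, with a made-up function name:

```c
/* Returns 1 if this translation unit was compiled with AVX enabled
 * (the compiler then uses VEX encodings such as vcvtsi2sd), else 0.
 * Relies on the __AVX__ macro predefined by GCC and Clang. */
static inline int built_with_vex_encoding(void)
{
#if defined(__AVX__)
    return 1;
#else
    return 0;
#endif
}
```

A kernel for a CPU without AVX, such as the Pentium M, could check this early in boot and refuse to continue, or turn the same test into a compile-time #error.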

See https://stackoverflow.com/tags/x86/info for links, including to Intel's insn set ref manual.

